diff --git a/README.md b/README.md index 8ff6a6f..60d2355 100644 --- a/README.md +++ b/README.md @@ -87,8 +87,8 @@ The main entry point is the `protein-quest` command line tool which has multiple subcommands to perform actions. To use programmaticly, see the -[Jupyter notebooks](https://www.bonvinlab.org/protein-quest/notebooks) and -[API documentation](https://www.bonvinlab.org/protein-quest/autoapi/protein_quest/). +[Jupyter notebooks](https://www.bonvinlab.org/protein-quest/notebooks/README.html) +and [API documentation](https://www.bonvinlab.org/protein-quest/autoapi/protein_quest/). While downloading or copying files it uses a global cache (located at `~/.cache/protein-quest`) and hardlinks to save disk space and improve speed. diff --git a/docs/notebooks/README.md b/docs/notebooks/README.md new file mode 100644 index 0000000..bd7b6f6 --- /dev/null +++ b/docs/notebooks/README.md @@ -0,0 +1,67 @@ +# Notebooks + +Jupyter notebooks show how to use protein-quest through its Python API and can be +run locally or in cloud notebook environments. + +## Available notebooks + +| Notebook | What you will do | +| --- | --- | +| [Search UniProt](uniprot.ipynb) | Find UniProt accessions and map them to PDB, AlphaFold, EMDB, and partner datasets. | +| [AlphaFold](alphafold.ipynb) | Download AlphaFold models, filter on confidence, and visualize structures with Mol*. | +| [PDBe](pdbe.ipynb) | Download PDBe structures, extract single chains, and visualize structures with Mol*. | + +## Launch in cloud environments + +Use the links below to open each notebook quickly. 
+ +| Notebook | Google Colab | notebooks.egi.eu | Binder | nbgitpuller | +| --- | --- | --- | --- | --- | +| Search UniProt | [Open](https://colab.research.google.com/github/haddocking/protein-quest/blob/main/docs/notebooks/uniprot.ipynb) | [Open hub](https://notebooks.egi.eu/hub/) | [Open](https://mybinder.org/v2/gh/haddocking/protein-quest/HEAD?urlpath=lab/tree/docs/notebooks/uniprot.ipynb) | [Generate link](https://nbgitpuller.readthedocs.io/en/latest/link.html?hub=https://notebooks.egi.eu&repo=https://github.com/haddocking/protein-quest&branch=main) | +| AlphaFold | [Open](https://colab.research.google.com/github/haddocking/protein-quest/blob/main/docs/notebooks/alphafold.ipynb) | [Open hub](https://notebooks.egi.eu/hub/) | [Open](https://mybinder.org/v2/gh/haddocking/protein-quest/HEAD?urlpath=lab/tree/docs/notebooks/alphafold.ipynb) | [Generate link](https://nbgitpuller.readthedocs.io/en/latest/link.html?hub=https://notebooks.egi.eu&repo=https://github.com/haddocking/protein-quest&branch=main) | +| PDBe | [Open](https://colab.research.google.com/github/haddocking/protein-quest/blob/main/docs/notebooks/pdbe.ipynb) | [Open hub](https://notebooks.egi.eu/hub/) | [Open](https://mybinder.org/v2/gh/haddocking/protein-quest/HEAD?urlpath=lab/tree/docs/notebooks/pdbe.ipynb) | [Generate link](https://nbgitpuller.readthedocs.io/en/latest/link.html?hub=https://notebooks.egi.eu&repo=https://github.com/haddocking/protein-quest&branch=main) | + + + +## Notes + +- Google Colab and Binder open notebooks directly from this repository. +- notebooks.egi.eu requires sign-in and VO enrollment before use. +- nbgitpuller links depend on a JupyterHub where nbgitpuller is installed. + +This section explains how to run notebooks locally. + +## Local setup + +1. Install Jupyter. + +```bash +python -m pip install jupyterlab +``` + +2. Install notebook dependencies. 
+ +For the released package: + +```bash +python -m pip install "protein-quest[nb]" +``` +(The `[nb]` extra installs `ipymolstar` for structure visualization in the AlphaFold and PDBe notebooks.) + +3. Start Jupyter and open a notebook. + +```bash +jupyter lab +``` + +Then open one of the notebooks (*.ipynb files). + +## Runtime dependencies + +The first code cell in each notebook installs required packages, including +`protein-quest` and `ipymolstar`. + +- Google Colab: best for quick exploration. +- Binder: no local install needed, but startup can take a few minutes. +- notebooks.egi.eu: requires an EGI account and VO enrollment. +- nbgitpuller: requires a JupyterHub deployment with nbgitpuller installed. diff --git a/docs/notebooks/alphafold.ipynb b/docs/notebooks/alphafold.ipynb index b098bf9..747f4df 100644 --- a/docs/notebooks/alphafold.ipynb +++ b/docs/notebooks/alphafold.ipynb @@ -1,5 +1,27 @@ { "cells": [ + { + "cell_type": "markdown", + "id": "20eedeb9", + "metadata": {}, + "source": [ + "## Environment setup\n", + "\n", + "Run the next cell once per fresh kernel to install notebook dependencies.\n", + "If you install packages in the active kernel, restart the kernel and rerun all cells." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0baea19c", + "metadata": {}, + "outputs": [], + "source": [ + "# Cloud and local notebooks: install required runtime dependencies.\n", + "!pip install -q protein-quest[nb]" + ] + }, { "cell_type": "markdown", "id": "24b1926c", @@ -12,7 +34,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 2, "id": "681ba946", "metadata": {}, "outputs": [], @@ -39,7 +61,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 3, "id": "81e449db", "metadata": {}, "outputs": [], @@ -49,7 +71,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 4, "id": "5c2e6ee3", "metadata": {}, "outputs": [], @@ -67,7 +89,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 7, "id": "e32b474a", "metadata": {}, "outputs": [ @@ -75,8 +97,8 @@ "name": "stderr", "output_type": "stream", "text": [ - "Fetching Alphafold summaries: 100%|██████████| 3/3 [00:00<00:00, 553.10it/s]\n", - "Downloading AlphaFold files: 100%|██████████| 6/6 [00:00<00:00, 38245.93it/s]" + "Fetching Alphafold summaries: 100%|██████████| 3/3 [00:00<00:00, 12.32it/s]\n", + "Downloading AlphaFold files: 100%|██████████| 6/6 [00:00<00:00, 34.79it/s]" ] }, { @@ -101,7 +123,7 @@ " pdbUrl=URL('https://alphafold.ebi.ac.uk/files/AF-A1YPR0-F1-model_v6.pdb'),\n", " providerId='GDM',\n", " 
sequence='MANDIDELIGIPFPNHSSEVLCSLNEQRHDGLLCDVLLVVQEQEYRTHRSVLAACSKYFKKLFTAGTLASQPYVYEIDFVQPEALAAILEFAYTSTLTITAGNVKHILNAARMLEIQCIVNVCLEIMEPGGDGGEEDDKEDDDDDEDDDDEEDEEEEEEEEEDDDDDTEDFADQENLPDPQDISCHQSPSKTDHLTEKAYSDTPRDFPDSFQAGSPGHLGVIRDFSIESLLRENLYPKANIPDRRPSLSPFAPDFFPHLWPGDFGAFAQLPEQPMDSGPLDLVIKNRKIKEEEKEELPPPPPPPFPNDFFKDMFPDLPGGPLGPIKAENDYGAYLNFLSATHLGGLFPPWPLVEERKLKPKASQQCPICHKVIMGAGKLPRHMRTHTGEKPYMCTICEVRFTRQDKLKIHMRKHTGERPYLCIHCNAKFVHNYDLKNHMRIHTGVRPYQCEFCYKSFTRSDHLHRHIKRQSCRMARPRRGRKPAAWRAASLLFGPGGPAPDKAAFVMPPALGEVGGHLGGAAVCLPGPSPAKHFLAAPKGALSLQELERQFEETQMKLFGRAQLEAERNAGGLLAFALAENVAAARPYFPLPDPWAAGLAGLPGLAGLNHVASMSEANN',\n", - " sequenceChecksum='73D82A34502B55BF',\n", + " sequenceChecksum='455da6445b69ec9853216a00638d635b',\n", " sequenceEnd=619,\n", " sequenceStart=1,\n", " sequenceVersionDate='2007-02-06T00:00:00Z',\n", @@ -163,7 +185,7 @@ " pdbUrl=URL('https://alphafold.ebi.ac.uk/files/AF-O60481-F1-model_v6.pdb'),\n", " providerId='GDM',\n", " sequence='MTMLLDGGPQFPGLGVGSFGAPRHHEMPNREPAGMGLNPFGDSTHAAAAAAAAAAFKLSPAAAHDLSSGQSSAFTPQGSGYANALGHHHHHHHHHHHTSQVPSYGGAASAAFNSTREFLFRQRSSGLSEAASGGGQHGLFAGSASSLHAPAGIPEPPSYLLFPGLHEQGAGHPSPTGHVDNNQVHLGLRGELFGRADPYRPVASPRTDPYAAGAQFPNYSPMNMNMGVNVAAHHGPGAFFRYMRQPIKQELSCKWIDEAQLSRPKKSCDRTFSTMHELVTHVTMEHVGGPEQNNHVCYWEECPREGKSFKAKYKLVNHIRVHTGEKPFPCPFPGCGKIFARSENLKIHKRTHTGEKPFKCEFEGCDRRFANSSDRKKHMHVHTSDKPYICKVCDKSYTHPSSLRKHMKVHESQGSDSSPAASSGYESSTPPAIASANSKDTTKTPSAVQTSTSHNPGLPPNFNEWYV',\n", - " sequenceChecksum='3150CF13C0679568',\n", + " sequenceChecksum='b2daa0c0f3120f23c7fe601510d1082e',\n", " sequenceEnd=467,\n", " sequenceStart=1,\n", " sequenceVersionDate='1998-08-01T00:00:00Z',\n", @@ -224,7 +246,7 @@ " pdbUrl=URL('https://alphafold.ebi.ac.uk/files/AF-P50613-F1-model_v6.pdb'),\n", " providerId='GDM',\n", " 
sequence='MALDVKSRAKRYEKLDFLGEGQFATVYKARDKNTNQIVAIKKIKLGHRSEAKDGINRTALREIKLLQELSHPNIIGLLDAFGHKSNISLVFDFMETDLEVIIKDNSLVLTPSHIKAYMLMTLQGLEYLHQHWILHRDLKPNNLLLDENGVLKLADFGLAKSFGSPNRAYTHQVVTRWYRAPELLFGARMYGVGVDMWAVGCILAELLLRVPFLPGDSDLDQLTRIFETLGTPTEEQWPDMCSLPDYVTFKSFPGIPLHHIFSAAGDDLLDLIQGLFLFNPCARITATQALKMKYFSNRPGPTPGCQLPRPNCPVETLKEQSNPALAIKRKRTEALEQGGLPKKLIF',\n", - " sequenceChecksum='0A94BFA7DD416CEB',\n", + " sequenceChecksum='efac0ba2abc8f0b14c6b6689a0f2d676',\n", " sequenceEnd=346,\n", " sequenceStart=1,\n", " sequenceVersionDate='1996-10-01T00:00:00Z',\n", @@ -279,14 +301,14 @@ ], "source": [ "summaries = [\n", - " s async for s in fetch_many_async([\"A1YPR0\", \"O60481\", \"P50613\"], save_dir, what={\"summary\", \"cif\", \"paeDoc\"})\n", + " s async for s in fetch_many_async([\"A1YPR0\", \"O60481\", \"P50613\"], save_dir, formats={\"summary\", \"cif\", \"paeDoc\"})\n", "]\n", "pprint(summaries)" ] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 8, "id": "2d3595e6", "metadata": {}, "outputs": [ @@ -294,12 +316,10 @@ "name": "stdout", "output_type": "stream", "text": [ - "total 4.3M\n", + "total 3.3M\n", "4.0K A1YPR0.json\n", "556K AF-A1YPR0-F1-model_v6.cif\n", "1.1M AF-A1YPR0-F1-predicted_aligned_error_v6.json\n", - "412K AF-O60481-2-F1-model_v6.cif\n", - "600K AF-O60481-2-F1-predicted_aligned_error_v6.json\n", "412K AF-O60481-F1-model_v6.cif\n", "628K AF-O60481-F1-predicted_aligned_error_v6.json\n", "324K AF-P50613-F1-model_v6.cif\n", @@ -325,7 +345,7 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 9, "id": "cc96c63a", "metadata": {}, "outputs": [], @@ -343,19 +363,19 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 10, "id": "73a61cf6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "[PosixPath('alphafold_files/AF-A1YPR0-F1-model_v4.cif'),\n", - " PosixPath('alphafold_files/AF-O60481-F1-model_v4.cif'),\n", - " PosixPath('alphafold_files/AF-P50613-F1-model_v4.cif')]" + 
"[PosixPath('alphafold_files/AF-A1YPR0-F1-model_v6.cif'),\n", + " PosixPath('alphafold_files/AF-O60481-F1-model_v6.cif'),\n", + " PosixPath('alphafold_files/AF-P50613-F1-model_v6.cif')]" ] }, - "execution_count": 12, + "execution_count": 10, "metadata": {}, "output_type": "execute_result" } @@ -375,7 +395,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 11, "id": "fbfdf472", "metadata": {}, "outputs": [], @@ -385,10 +405,123 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 12, "id": "152aec9a", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO:distributed.http.proxy:To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy\n", + "INFO:distributed.scheduler:State start\n", + "INFO:distributed.scheduler: Scheduler at: tcp://127.0.0.1:34887\n", + "INFO:distributed.scheduler: dashboard at: http://127.0.0.1:8787/status\n", + "INFO:distributed.scheduler:Registering Worker plugin shuffle\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:41671'\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:38889'\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:37481'\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:36369'\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:45561'\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:37863'\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:39747'\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:44891'\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:34577'\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:45645'\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:34971 name: 1\n", + "INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:34971\n", + 
"INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55472\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:35811 name: 3\n", + "INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:35811\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55476\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:46585 name: 0\n", + "INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:46585\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55456\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:45093 name: 2\n", + "INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:45093\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55482\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:42155 name: 7\n", + "INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:42155\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55490\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:41473 name: 5\n", + "INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:41473\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55496\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:32977 name: 8\n", + "INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:32977\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55508\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:36937 name: 4\n", + "INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:36937\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55518\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:39297 name: 6\n", + 
"INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:39297\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55532\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:38585 name: 9\n", + "INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:38585\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55546\n", + "INFO:distributed.scheduler:Receive client connection: Client-c7e0282d-2210-11f1-9b29-00155d9940c2\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:55554\n", + "INFO:distributed.scheduler:Registering Worker plugin forward-logging-\n", + "INFO:distributed.scheduler:Remove client Client-c7e0282d-2210-11f1-9b29-00155d9940c2\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55554; closing.\n", + "INFO:distributed.scheduler:Remove client Client-c7e0282d-2210-11f1-9b29-00155d9940c2\n", + "INFO:distributed.scheduler:Close client connection: Client-c7e0282d-2210-11f1-9b29-00155d9940c2\n", + "INFO:distributed.scheduler:Retire worker addresses (stimulus_id='retire-workers-1773759089.6439693') (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)\n", + "INFO:distributed.nanny:Closing Nanny at 'tcp://127.0.0.1:41671'. Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.nanny:Closing Nanny at 'tcp://127.0.0.1:38889'. Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.nanny:Closing Nanny at 'tcp://127.0.0.1:37481'. Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.nanny:Closing Nanny at 'tcp://127.0.0.1:36369'. Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.nanny:Closing Nanny at 'tcp://127.0.0.1:45561'. 
Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.nanny:Closing Nanny at 'tcp://127.0.0.1:37863'. Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.nanny:Closing Nanny at 'tcp://127.0.0.1:39747'. Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.nanny:Closing Nanny at 'tcp://127.0.0.1:44891'. Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.nanny:Closing Nanny at 'tcp://127.0.0.1:34577'. Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.nanny:Closing Nanny at 'tcp://127.0.0.1:45645'. Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55456; closing.\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55472; closing.\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55482; closing.\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55476; closing.\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55518; closing.\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55496; closing.\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55532; closing.\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55490; closing.\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55508; closing.\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:46585 name: 0 (stimulus_id='handle-worker-cleanup-1773759089.6790571')\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:34971 name: 1 
(stimulus_id='handle-worker-cleanup-1773759089.6811926')\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:45093 name: 2 (stimulus_id='handle-worker-cleanup-1773759089.6825626')\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:35811 name: 3 (stimulus_id='handle-worker-cleanup-1773759089.6848328')\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:36937 name: 4 (stimulus_id='handle-worker-cleanup-1773759089.6870415')\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:41473 name: 5 (stimulus_id='handle-worker-cleanup-1773759089.6889753')\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:39297 name: 6 (stimulus_id='handle-worker-cleanup-1773759089.6909473')\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:42155 name: 7 (stimulus_id='handle-worker-cleanup-1773759089.6930795')\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:32977 name: 8 (stimulus_id='handle-worker-cleanup-1773759089.6945615')\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:55546; closing.\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:38585 name: 9 (stimulus_id='handle-worker-cleanup-1773759089.699611')\n", + "INFO:distributed.scheduler:Lost all workers\n", + "INFO:distributed.nanny:Nanny at 'tcp://127.0.0.1:41671' closed.\n", + "INFO:distributed.nanny:Nanny at 'tcp://127.0.0.1:37481' closed.\n", + "INFO:distributed.nanny:Nanny at 'tcp://127.0.0.1:38889' closed.\n", + "INFO:distributed.nanny:Nanny at 'tcp://127.0.0.1:39747' closed.\n", + "INFO:distributed.nanny:Nanny at 'tcp://127.0.0.1:44891' closed.\n", + "INFO:distributed.nanny:Nanny at 'tcp://127.0.0.1:37863' closed.\n", + "INFO:distributed.nanny:Nanny at 'tcp://127.0.0.1:45561' closed.\n", + "INFO:distributed.nanny:Nanny at 'tcp://127.0.0.1:36369' closed.\n", + "INFO:distributed.nanny:Nanny at 'tcp://127.0.0.1:34577' closed.\n", + "INFO:distributed.nanny:Nanny 
at 'tcp://127.0.0.1:45645' closed.\n", + "INFO:distributed.scheduler:Closing scheduler. Reason: unknown\n", + "INFO:distributed.scheduler:Scheduler closing all comms\n" + ] + } + ], "source": [ "output_dir = Path(\"./filtered\")\n", "output_dir.mkdir(exist_ok=True)\n", @@ -397,19 +530,50 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 13, "id": "6a6f8e3f", "metadata": {}, "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "WARNING:protein_quest.parallel:Not using all CPU cores (10) of machine, environment variable \"OMP_NUM_THREADS\" is set to 1.\n", + "INFO:distributed.scheduler:State start\n", + "INFO:distributed.scheduler: Scheduler at: tcp://127.0.0.1:46063\n", + "INFO:distributed.scheduler: dashboard at: http://127.0.0.1:8787/status\n", + "INFO:distributed.scheduler:Registering Worker plugin shuffle\n", + "INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:38027'\n", + "INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:44111 name: 0\n", + "INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:44111\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:48678\n", + "INFO:distributed.scheduler:Receive client connection: Client-cd3d96b6-2210-11f1-9b29-00155d9940c2\n", + "INFO:distributed.core:Starting established connection to tcp://127.0.0.1:48682\n", + "INFO:distributed.scheduler:Registering Worker plugin forward-logging-\n", + "INFO:distributed.scheduler:Remove client Client-cd3d96b6-2210-11f1-9b29-00155d9940c2\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:48682; closing.\n", + "INFO:distributed.scheduler:Remove client Client-cd3d96b6-2210-11f1-9b29-00155d9940c2\n", + "INFO:distributed.scheduler:Close client connection: Client-cd3d96b6-2210-11f1-9b29-00155d9940c2\n", + "INFO:distributed.scheduler:Retire worker addresses (stimulus_id='retire-workers-1773759098.5168622') (0,)\n", + "INFO:distributed.nanny:Closing 
Nanny at 'tcp://127.0.0.1:38027'. Reason: nanny-close\n", + "INFO:distributed.nanny:Nanny asking worker to close. Reason: nanny-close\n", + "INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:48678; closing.\n", + "INFO:distributed.scheduler:Remove worker addr: tcp://127.0.0.1:44111 name: 0 (stimulus_id='handle-worker-cleanup-1773759098.5205002')\n", + "INFO:distributed.scheduler:Lost all workers\n", + "INFO:distributed.nanny:Nanny at 'tcp://127.0.0.1:38027' closed.\n", + "INFO:distributed.scheduler:Closing scheduler. Reason: unknown\n", + "INFO:distributed.scheduler:Scheduler closing all comms\n" + ] + }, { "data": { "text/plain": [ - "[ConfidenceFilterResult(input_file='AF-A1YPR0-F1-model_v4.cif', count=175, filtered_file=PosixPath('filtered/AF-A1YPR0-F1-model_v4.cif')),\n", - " ConfidenceFilterResult(input_file='AF-O60481-F1-model_v4.cif', count=76, filtered_file=None),\n", - " ConfidenceFilterResult(input_file='AF-P50613-F1-model_v4.cif', count=244, filtered_file=PosixPath('filtered/AF-P50613-F1-model_v4.cif'))]" + "[ConfidenceFilterResult(input_file='AF-A1YPR0-F1-model_v6.cif', count=199, filtered_file=PosixPath('filtered/AF-A1YPR0-F1-model_v6.cif')),\n", + " ConfidenceFilterResult(input_file='AF-O60481-F1-model_v6.cif', count=67, filtered_file=None),\n", + " ConfidenceFilterResult(input_file='AF-P50613-F1-model_v6.cif', count=248, filtered_file=PosixPath('filtered/AF-P50613-F1-model_v6.cif'))]" ] }, - "execution_count": 17, + "execution_count": 13, "metadata": {}, "output_type": "execute_result" } @@ -430,18 +594,72 @@ "2 files have passed, but 1 file only has 75 high confidence residues so it is discarded." ] }, + { + "cell_type": "markdown", + "id": "ef3b6b52", + "metadata": {}, + "source": [ + "## Visualize a structure with Mol*\n", + "\n", + "Use [molviewspec](https://molstar.org/mol-view-spec/) to visualize one of the structures." 
+ ] + }, { "cell_type": "code", "execution_count": null, "id": "83ffc09b", "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "application/javascript": "\n setTimeout(function(){\n var wrapper = document.getElementById(\"molstar_c397f568-afa8-462c-8155-f76e2f3ecee5\")\n if (wrapper === null) {\n throw new Error(\"Wrapper element #molstar_c397f568-afa8-462c-8155-f76e2f3ecee5 not found anymore\")\n }\n var blob = new Blob([\"\\n\\n \\n