BioContext7

Python SDK

Use the BioContext7 Python SDK for programmatic access to bioinformatics tools.

Installation

pip install biocontext7

Or from source:

pip install -e "/path/to/biocontext7/packages/biocontext7"

Quick Start

import asyncio
from bio_pipeline_maker import PipelineMaker
 
async def main():
    maker = PipelineMaker()
    result = await maker.create(
        intent="RNA-seq from FASTQ to counts",
        target="nextflow",
    )
    if result.success:
        print(result.files["main.nf"])
 
asyncio.run(main())

PipelineMaker

The core class for pipeline generation.

PipelineMaker()

Create a new pipeline maker instance.

from bio_pipeline_maker import PipelineMaker
 
maker = PipelineMaker()

maker.create(intent, target, **options)

Generate a pipeline from a natural language description.

result = await maker.create(
    intent="RNA-seq pipeline from FASTQ to gene counts",
    target="nextflow",  # one of "nextflow", "snakemake", "wdl", "cwl"
)
 
# result.success: bool
# result.files: dict[str, str]
# result.tools_discovered: int
# result.lsp_iterations: int
# result.stub_test_passed: bool
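
The fields listed above suggest a natural follow-up step: check success, then write each entry of files to disk. The following is a minimal sketch, assuming that files maps relative file names to file contents. PipelineResult here is a hypothetical stand-in dataclass that mirrors the documented fields, not the SDK's actual class, and write_pipeline_files is an illustrative helper that the SDK does not necessarily provide.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class PipelineResult:
    """Hypothetical stand-in mirroring the documented result fields."""
    success: bool
    files: dict[str, str] = field(default_factory=dict)
    tools_discovered: int = 0
    lsp_iterations: int = 0
    stub_test_passed: bool = False


def write_pipeline_files(result: PipelineResult, out_dir: str) -> list[Path]:
    """Write every generated file under out_dir and return the paths written."""
    if not result.success:
        raise RuntimeError("pipeline generation did not succeed")
    root = Path(out_dir)
    written = []
    for name, content in result.files.items():
        path = root / name
        # Assumption: names may include subdirectories, so create parents first.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
        written.append(path)
    return written
```

With the SDK installed, the object returned by maker.create(...) could be passed in place of the stand-in, provided it exposes the same success and files attributes.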

Registry Clients

BioToolsClient

Search and query the bio.tools registry.

from bio_pipeline_maker.registry import BioToolsClient
 
async with BioToolsClient() as client:
    # Free-text search
    results = await client.search("alignment")
 
    # Search with EDAM filter
    results = await client.search(
        "alignment",
        operation="operation:0292"
    )
 
    # Get tool details
    tool = await client.get_tool("star")
 
    # Search by EDAM term
    tools = await client.search_by_edam(
        "http://edamontology.org/operation_0492",
        annotation_type="operation",
    )
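
The example above spells EDAM identifiers in two ways: a short form ("operation:0292") for the search filter and a full URI ("http://edamontology.org/operation_0492") for search_by_edam. A small sketch of converting between the two follows. The helpers to_edam_uri and to_edam_short are hypothetical and not part of the SDK, and the short form is assumed to always be a branch name, a colon, and a numeric ID.

```python
import re

EDAM_BASE = "http://edamontology.org/"
BRANCHES = "operation|topic|data|format"

# Short form as used in the search filter, e.g. "operation:0292".
_SHORT = re.compile(rf"^({BRANCHES}):(\d+)$")
# Full URI form, e.g. "http://edamontology.org/operation_0292".
_URI = re.compile(rf"^{re.escape(EDAM_BASE)}({BRANCHES})_(\d+)$")


def to_edam_uri(short_id: str) -> str:
    """Convert 'operation:0292' to 'http://edamontology.org/operation_0292'."""
    match = _SHORT.match(short_id)
    if match is None:
        raise ValueError(f"not a short EDAM identifier: {short_id!r}")
    branch, number = match.groups()
    return f"{EDAM_BASE}{branch}_{number}"


def to_edam_short(uri: str) -> str:
    """Convert 'http://edamontology.org/operation_0292' to 'operation:0292'."""
    match = _URI.match(uri)
    if match is None:
        raise ValueError(f"not an EDAM URI: {uri!r}")
    branch, number = match.groups()
    return f"{branch}:{number}"
```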

UniProtClient

Query the UniProt protein database.

from bio_pipeline_maker.registry import UniProtClient
 
async with UniProtClient() as client:
    # Get protein by accession
    protein = await client.get_protein("P04637")
 
    # Search proteins
    results = await client.search("tumor suppressor")
 
    # ID mapping
    mapped = await client.map_ids(
        ids=["P04637"],
        from_db="UniProtKB_AC-ID",
        to_db="PDB",
    )

HMDBClient

Query the Human Metabolome Database.

from bio_pipeline_maker.registry import HMDBClient
 
async with HMDBClient() as client:
    # Search metabolites
    results = await client.search("glucose")
 
    # Get metabolite details
    metabolite = await client.get_metabolite("HMDB0000122")

BeaconClient

Federated variant queries using the GA4GH Beacon protocol.

from bio_pipeline_maker.registry import BeaconClient
 
async with BeaconClient("https://beacon.example.org") as client:
    response = await client.query(
        chromosome="1",
        position=12345,
        reference="A",
        alternate="G",
    )
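
Because each BeaconClient is bound to a single Beacon URL, querying several Beacons means running one query per endpoint. A minimal sketch of doing that concurrently with asyncio.gather is shown below. query_all is a hypothetical helper, not part of the SDK, and query_one stands for any coroutine function you supply that takes a URL and performs the query against it.

```python
import asyncio


async def query_all(urls, query_one):
    """Run query_one(url) for every URL in the list concurrently.

    Returns a dict mapping each URL to its result, or to the exception
    raised for that URL, so one unreachable Beacon does not fail the batch.
    """
    results = await asyncio.gather(
        *(query_one(url) for url in urls),
        return_exceptions=True,  # collect failures instead of raising
    )
    return dict(zip(urls, results))
```

With the SDK, query_one could open a BeaconClient for the given URL in an async with block and return the value of client.query(...). Callers then check each value with isinstance(value, Exception) before using it.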

EDAMOntology

EDAM ontology term resolution.

from bio_pipeline_maker.registry import EDAMOntology
 
edam = EDAMOntology()
 
# Look up operations
uri = edam.lookup_operation("align sequences")
 
# Look up topics
topic = edam.lookup_topic("genomics")
 
# Suggest pipeline tools
pipeline = edam.suggest_pipeline("RNA-seq differential expression")

Error Handling

Registry clients raise specific exceptions, all importable from bio_pipeline_maker.exceptions:

from bio_pipeline_maker.exceptions import (
    ToolNotFoundError,
    RegistryConnectionError,
    ValidationError,
)
 
try:
    # client is an open BioToolsClient, as in the examples above
    tool = await client.get_tool("nonexistent-tool")
except ToolNotFoundError:
    print("Tool not found in registry")
except RegistryConnectionError:
    print("Could not connect to registry")
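
A connection failure is often transient, whereas a missing tool is not, so it can make sense to retry only the former. The sketch below illustrates that pattern with exponential backoff. The two exception classes are local stand-ins so that the example runs on its own (with the SDK installed, import them from bio_pipeline_maker.exceptions instead), and with_retries is a hypothetical helper, not an SDK function.

```python
import asyncio


class ToolNotFoundError(Exception):
    """Stand-in for the SDK exception of the same name."""


class RegistryConnectionError(Exception):
    """Stand-in for the SDK exception of the same name."""


async def with_retries(make_call, attempts=3, base_delay=0.5):
    """Await make_call(), retrying only on RegistryConnectionError.

    make_call is a zero-argument function returning a fresh awaitable,
    because a coroutine object cannot be awaited twice.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await make_call()
        except RegistryConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller handle it
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
```

A call site might look like tool = await with_retries(lambda: client.get_tool("star")). ToolNotFoundError is not caught, so it propagates on the first attempt.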

Async Context Managers

All registry clients support async context managers for automatic connection cleanup:

async with BioToolsClient() as client:
    # Connection is managed automatically
    results = await client.search("alignment")
# Connection is closed here
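
To illustrate what "automatic connection cleanup" means in practice, here is a self-contained sketch of the async context manager protocol. DemoClient is a hypothetical stand-in, not one of the SDK's registry clients, and how the real clients manage their connections internally is an assumption. The point it shows is that the cleanup in __aexit__ runs even when the body of the async with block raises.

```python
import asyncio


class DemoClient:
    """Hypothetical stand-in; not one of the SDK's registry clients."""

    def __init__(self):
        self.is_open = False

    async def __aenter__(self):
        # A real client would open its network session here.
        self.is_open = True
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Runs on normal exit and when the body raises.
        self.is_open = False
        return False  # do not suppress the exception

    async def search(self, query):
        if not self.is_open:
            raise RuntimeError("client is closed")
        return [f"result for {query}"]


async def demo():
    client = DemoClient()
    try:
        async with client:
            await client.search("alignment")
            raise ValueError("simulated failure inside the block")
    except ValueError:
        pass
    return client.is_open


print(asyncio.run(demo()))  # False: closed despite the exception
```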
