Use Anh Vu's OpenAI prompting to develop structured metadata about Bioconductor packages, targeting the EDAM ontology and the bio.tools schema
Source: R/curate_bioc.R
curate_bioc.Rd
Use Anh Vu's OpenAI prompting to develop structured metadata about Bioconductor packages, targeting the EDAM ontology and the bio.tools schema.
Usage
curate_bioc(
packageName = "chromVAR",
devurl =
"https://raw.githubusercontent.com/GreenleafLab/chromVAR/refs/heads/master/README.md"
)
Note
Schema completion is run with the temperature set to 0.0; see the edamize function for more flexibility.
Examples
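if (interactive()) {
  # The example calls the OpenAI API, so an API key must be available.
  key = Sys.getenv("OPENAI_API_KEY")
  if (nchar(key) == 0) stop("need to have OPENAI_API_KEY set")
  # With no arguments, the chromVAR defaults shown under Usage are used.
  lk = curate_bioc()
  str(lk)
}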
#> Loading required namespace: reticulate
#>
#> Warning: An ephemeral virtual environment managed by 'reticulate' is currently in use.
#> To add more packages to your current session, call `py_require()` instead
#> of `py_install()`. Running:
#> `py_require(c("jsonschema==4.23.0", "openai==1.66.3", "pandas==2.2.3", "requests==2.32.3", "tiktoken==0.9.0"))`
#> Done!
#> List of 2
#> $ base_final :Dict (18 items)
#> $ edam_processed:{'topic': [{'term': 'Epigenomics', 'uri': 'http://edamontology.org/topic_3173'}, {'term': 'Functional genomics', 'uri': 'http://edamontology.org/topic_0085'}, {'term': 'Gene regulation', 'uri': 'http://edamontology.org/topic_0204'}, {'term': 'Bioinformatics', 'uri': 'http://edamontology.org/topic_0091'}], 'function': [{'operation': [{'term': 'Sequence motif discovery', 'uri': 'http://edamontology.org/operation_0238'}, {'term': 'Sequence motif recognition', 'uri': 'http://edamontology.org/operation_0239'}, {'term': 'Gene expression profiling', 'uri': 'http://edamontology.org/operation_0314'}, {'term': 'Differential binding analysis', 'uri': 'http://edamontology.org/operation_3677'}], 'input': [{'data': {'term': 'Sequence record', 'uri': 'http://edamontology.org/data_0849'}, 'format': [{'term': 'BED', 'uri': 'http://edamontology.org/format_3003'}, {'term': 'BAM', 'uri': 'http://edamontology.org/format_2572'}]}], 'output': [{'data': {'term': 'Gene expression profile', 'uri': 'http://edamontology.org/data_0928'}, 'format': [{'term': 'CSV', 'uri': 'http://edamontology.org/format_3752'}]}]}]}
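The edam_processed element printed above is displayed as a Python dictionary of EDAM terms and URIs. The following is a minimal sketch of how the topic entries could be tabulated in R. It assumes the dictionary has first been converted to a nested R list (for example with reticulate::py_to_r(lk$edam_processed), which is expected, but not confirmed here, to give a similar shape). The list literal is a hand-copied excerpt of the output above, and edam_terms is a hypothetical helper that is not part of the package.

```r
# Hand-copied excerpt of the 'topic' entries from the example output,
# written as a nested R list. With a live result, the conversion via
# reticulate::py_to_r() is an assumption about the returned object.
edam = list(
  topic = list(
    list(term = "Epigenomics", uri = "http://edamontology.org/topic_3173"),
    list(term = "Functional genomics", uri = "http://edamontology.org/topic_0085"),
    list(term = "Gene regulation", uri = "http://edamontology.org/topic_0204"),
    list(term = "Bioinformatics", uri = "http://edamontology.org/topic_0091")
  )
)

# Hypothetical helper: flatten a list of term/uri pairs into a data.frame
# with one row per EDAM concept.
edam_terms = function(entries) {
  data.frame(
    term = vapply(entries, function(x) x$term, character(1)),
    uri = vapply(entries, function(x) x$uri, character(1)),
    stringsAsFactors = FALSE
  )
}

topics = edam_terms(edam$topic)
print(topics)
```

The same helper could be applied to the operation entries nested under function, since they use the same term/uri layout in the output shown.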