NIF-11878

DISCO: import additional data for GEO

Details

    • NIF
    • Issues closed as MONARCH has transitioned from UCSD services

    Description

      DISCO id: nif-0000-00142

      Currently you host three tables: contributor, platform, and series. However, the real detail describing a sample lives at the "Sample" level, and I'd like to add that. Looking at the current interop file, it looks like you run a shell script to generate a set of tab-delimited files, so I'm not sure what your host data URL is.
      Here's an example record from GEO: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM675693. Notice the ids are prefixed with "GSM".
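
      To illustrate what "actual information about the samples" could look like in practice, here is a minimal sketch that builds a plain-text URL for one Sample and parses the attribute lines of a SOFT-style record. The `targ=self&form=text&view=quick` parameters, the function names, and the sample text in the usage note are assumptions for illustration, not part of the current DISCO setup.

```python
from collections import defaultdict

# Base URL taken from the example record above.
GEO_ACC = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def sample_text_url(accession):
    """Build a URL for the text view of one Sample record.

    Assumption: the form/view/targ parameters below request SOFT-style text.
    """
    if not accession.startswith("GSM"):
        raise ValueError("Sample accessions are expected to start with 'GSM'")
    return f"{GEO_ACC}?acc={accession}&targ=self&form=text&view=quick"


def parse_soft(text):
    """Collect '!Key = value' attribute lines into a dict of lists.

    Keys can repeat (for example characteristics), so every value is kept.
    Lines that do not start with '!' (such as '^SAMPLE = ...') are skipped.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.startswith("!") or " = " not in line:
            continue
        key, value = line[1:].split(" = ", 1)
        fields[key.strip()].append(value.strip())
    return dict(fields)
```

      For example, `parse_soft("!Sample_title = placeholder")` returns a dict with the single key `Sample_title`; the resulting key/value pairs could then be written out as one row per Sample in a tab-delimited file.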

      Sample records can be retrieved through NCBI's E-utilities REST interface, as outlined below, though you might have some other method of getting at them. We'd want the actual information about the samples, not just the IDs.

      Construct and perform an eSearch in db=gds to retrieve all Series records using:

      http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=gds&term=GSE[ETYP]&retmax=5000&usehistory=y

      Use the query_key and WebEnv parameters from the eSearch to perform an eSummary:

      http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=gds&query_key=X&WebEnv=ENTER_WEBENV_PARAMETER_HERE

      This retrieves summary documents for all Series records.

      Within those summary documents, 'GSM_L' lists the Sample accession numbers for each Series.
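
      The eSearch and eSummary steps above can be sketched roughly as follows. This is only an outline under stated assumptions: the helper names are hypothetical, the result XML is assumed to expose `QueryKey`, `WebEnv`, and `DocSum`/`Item` elements, and the 'GSM_L' value is assumed to be a delimited string, which should be checked against a real response.

```python
import re
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Base URL as given in the description.
EUTILS = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(retmax=5000):
    """URL for the eSearch over all Series records in db=gds."""
    params = {"db": "gds", "term": "GSE[ETYP]", "retmax": retmax, "usehistory": "y"}
    return EUTILS + "/esearch.fcgi?" + urllib.parse.urlencode(params)


def parse_history(esearch_xml):
    """Pull the query_key and WebEnv values out of an eSearch result."""
    root = ET.fromstring(esearch_xml)
    return root.findtext("QueryKey"), root.findtext("WebEnv")


def esummary_url(query_key, webenv):
    """URL for the eSummary that reuses the stored eSearch result."""
    params = {"db": "gds", "query_key": query_key, "WebEnv": webenv}
    return EUTILS + "/esummary.fcgi?" + urllib.parse.urlencode(params)


def sample_ids(esummary_xml):
    """Map each summary document Id to the tokens found in its 'GSM_L' item.

    Assumption: 'GSM_L' holds a string delimited by ';', ',' or whitespace.
    """
    root = ET.fromstring(esummary_xml)
    result = {}
    for doc in root.iter("DocSum"):
        raw = doc.findtext("Item[@Name='GSM_L']") or ""
        result[doc.findtext("Id")] = [t for t in re.split(r"[;,\s]+", raw) if t]
    return result


def fetch(url):
    """Download a URL and return its body as text (network access required)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")
```

      A run would chain these as `sample_ids(fetch(esummary_url(*parse_history(fetch(esearch_url())))))`. Note that this still yields only Sample accessions per Series; the Sample-level details would need a further lookup per accession.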

          People

            nlw Nicole Washington