NIF-11819

Monarch services to fetch data for scheduled owlsim precompute jobs


Details

    • Project: NIF
    • Resolution: Issues closed as MONARCH has transitioned from UCSD services

    Description

      Until owlsim is fully integrated as services wrapping the entire NIF database, we need to do periodic data dumps onto our Jenkins host in order to regularly run the owlsim precompute pipeline.

      These queries will use the NIF REST services to pull data matching a given set of criteria. Overall, the jobs are:

      1. fetching data directly from the NIF genotype/variant and phenotype tables (as specified by me), producing a series of 2-column files (see the sketch after this list). Specific types/resources will be made into sub-task tickets.

      2. either (a) filtering set (1) above using our "filters", or (b) doing a fresh query for each "filter", embedding the filtering at query time. Of special interest here is how to build the filtering mechanism. I would like these to be available via services too, perhaps even as part of our monarch-app, but I am not sure about the implementation here. Chris, we should discuss strategies for filter implementation soon. Kent, examples of these filters are "get me all genotype ids where there is variation in only one gene" or "get me all the genotype ids where the variation(s) are generated with treatments (such as RNAi/Morpholino)". The filter types will also be separate sub-task tickets.
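
      As a concrete shape for job (1), and for embedding a filter at query time as in (2b), here is a minimal sketch in Python. The base URL, the q/count/key parameters, and the result/rows response layout are assumptions for illustration only, not the documented NIF API; the real service details would come from the NIF documentation.

        # Sketch only: pull rows from a NIF federation-style REST service and
        # write a 2-column file. The endpoint path, query parameters, and
        # response layout below are assumptions, not the documented NIF API.
        import csv
        import requests

        NIF_BASE = "https://neuinfo.org/servicesv1/v1/federation/data"  # hypothetical base URL

        def fetch_two_column(resource_id, col_a, col_b, out_path, query="*", api_key=None):
            """Query one NIF resource and write two chosen columns as a TSV."""
            params = {"q": query, "count": 1000}        # current 1000-record cap
            if api_key:
                params["key"] = api_key                 # assumed parameter name
            resp = requests.get(f"{NIF_BASE}/{resource_id}.json", params=params, timeout=60)
            resp.raise_for_status()
            rows = resp.json()["result"]["rows"]        # assumed response structure
            with open(out_path, "w", newline="") as fh:
                out = csv.writer(fh, delimiter="\t")
                for row in rows:
                    out.writerow([row.get(col_a), row.get(col_b)])

      Passing a filter expression as the query argument would cover the (2b) case; applying filters to the already-fetched rows would cover (2a).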

      I would like the code to be generic enough that we can give it a resource id, data types, and filter(s) as arguments, and the code will then do its business. I think the NIF-specific resource ids should probably live in an external configuration file that is not committed with the public code, considering we will need to keep an API key in there too; a sketch of this separation follows below.
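
      A minimal sketch of that separation, assuming an untracked JSON config file; the file name, its keys, and the run_job signature below are hypothetical, not an agreed interface.

        # Sketch only: keep NIF resource ids and the API key out of the public repo.
        # The config file name, its keys, and run_job's signature are hypothetical.
        import json

        def load_config(path="conf/nif_keys.json"):     # file is git-ignored, never committed
            # Illustrative shape: {"api_key": "...",
            #                      "resources": {"zfin_genotype_phenotype": "nlx_..."}}
            with open(path) as fh:
                return json.load(fh)

        def run_job(resource_name, data_types, filters=()):
            """Generic entry point: resolve the resource id from config, fetch the
            requested data types, and apply any filters before writing output."""
            cfg = load_config()
            resource_id = cfg["resources"][resource_name]
            # e.g. fetch_two_column(resource_id, *data_types,
            #                       out_path=f"{resource_name}.tsv",
            #                       api_key=cfg["api_key"])   # see the earlier sketch
            return resource_id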

      NIF has nearly finished the requirements for the API-key infrastructure necessary for us to use the services to pull data without limits. In the meantime, we can write the methods against the 1000-record cap that currently exists, knowing that it will go away soon. Besides, I suppose we might need a fallback mechanism built in anyway, to iterate through the data in 1000-record chunks, in case the API key gets corrupted on one end or the other; that fallback is sketched below.
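
      A minimal sketch of that chunked fallback, assuming the service accepts offset/count paging parameters (an assumption, not a documented guarantee):

        # Sketch only: page through a resource in 1000-record chunks while unlimited
        # pulls are unavailable. The offset/count parameters and response layout are assumed.
        import requests

        NIF_BASE = "https://neuinfo.org/servicesv1/v1/federation/data"  # hypothetical base URL

        def fetch_all_rows(resource_id, query="*", chunk=1000, api_key=None):
            """Yield every row by repeatedly requesting chunk-sized pages."""
            offset = 0
            while True:
                params = {"q": query, "offset": offset, "count": chunk}
                if api_key:
                    params["key"] = api_key               # assumed parameter name
                resp = requests.get(f"{NIF_BASE}/{resource_id}.json", params=params, timeout=60)
                resp.raise_for_status()
                rows = resp.json()["result"]["rows"]      # assumed response structure
                if not rows:
                    break
                yield from rows
                if len(rows) < chunk:                     # last (partial) page
                    break
                offset += chunk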

      I will attach the SQL that I currently use to produce these tables to the specific sub-task tickets.


            People

              Assignee: kshefchek (Kent Shefchek)
              Reporter: nlw (Nicole Washington)
              Votes: 0
              Watchers: 3
