Assignment #7


Data Persistence

Objective:
To create persistent storage for genomic data extracted from the UniGene data set.
Source files location:
Use the file ../guest/source/unigene.txt if you are working from your home directory on bicourse.
You may also retrieve the file from this URL: unigene
This file is a portion (200 IDs) of the Homo sapiens UniGene data available from ftp://ftp.ncbi.nih.gov/repository/UniGene/Hs.data.Z
Task:
  1. Parse the "unigene" file and extract the data associated with the fields ID, TITLE, CHROMOSOME, GENE and EXPRESS.
  2. Create a database with this data, using the UniGene ID as the key.
  3. Write a simple script to query this database by CHROMOSOME, GENE or EXPRESS. For each hit, return the UniGene ID and the other data stored with it.
  4. The stored data can keep its original upper and lower case, but the search should be case insensitive.
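
One possible approach to the four tasks is sketched below in Python. It is only a sketch under stated assumptions: that each record in unigene.txt is made of lines of the form "FIELD value" and ends with a line containing only "//", and that a shelve file (the persistent dictionary in Python's standard library) is an acceptable database. The names parse_unigene, build_db and query are hypothetical and not part of the assignment.

```python
import shelve

# Fields the assignment asks us to keep.
FIELDS = ("ID", "TITLE", "CHROMOSOME", "GENE", "EXPRESS")


def parse_unigene(lines):
    """Yield one dict per record.

    Assumption: records are 'FIELD value' lines terminated by a '//' line.
    """
    record = {}
    for line in lines:
        if line.strip() == "//":
            if "ID" in record:
                yield record
            record = {}
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in FIELDS:
            # Keep the first occurrence of a field; original case is preserved.
            record.setdefault(parts[0], parts[1].strip())
    if "ID" in record:  # last record if the file lacks a final '//'
        yield record


def build_db(source_path, db_path):
    """Create a new shelve database keyed by UniGene ID."""
    with open(source_path) as src, shelve.open(db_path, flag="n") as db:
        for record in parse_unigene(src):
            db[record["ID"]] = record


def query(db_path, field, term):
    """Return {unigene_id: record} for case-insensitive hits on one field.

    Design choice of this sketch: EXPRESS is matched as a substring,
    CHROMOSOME and GENE must match exactly (ignoring case).
    """
    field = field.upper()
    if field not in ("CHROMOSOME", "GENE", "EXPRESS"):
        raise ValueError("query field must be CHROMOSOME, GENE or EXPRESS")
    term = term.lower()
    hits = {}
    with shelve.open(db_path, flag="r") as db:
        for unigene_id, record in db.items():
            value = record.get(field, "").lower()
            matched = term in value if field == "EXPRESS" else term == value
            if matched:
                hits[unigene_id] = record
    return hits
```

For example, build_db("unigene.txt", "unigene_db") followed by query("unigene_db", "express", "LIVER") would return every stored record whose EXPRESS field mentions liver in any case. CHROMOSOME and GENE are compared exactly here so that a search for chromosome 1 does not also return chromosome 11; EXPRESS is matched as a substring because it typically lists several tissues. Either choice could reasonably be changed.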
