Saturday, June 23, 2007
Mapping Agilent Microarray ID to Affymetrix GeneChip using ActiveRecord
First, I write an ActiveRecord model script (ensembl.rb) for the Agilentprobe
table in Ensembl Mart.
#!/usr/bin/env ruby
require 'rubygems'
require_gem 'activerecord'
ActiveRecord::Base.pluralize_table_names = false
ActiveRecord::Base.establish_connection(
:adapter => 'mysql',
:database => 'ensembl_mart_40',
:host => 'ensembldb.ensembl.org',
:username => 'anonymous',
:password => ''
)
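class Agilentprobe < ActiveRecord::Base
set_table_name "mmusculus_gene_ensembl__xref_agilentprobe__dm"
set_primary_key "gene_id_key"
end
# The Moe4302 model was lost from this listing. The table name below is taken
# from the SQL one-liner further down this page, so treat it as a reconstruction.
class Moe4302 < ActiveRecord::Base
set_table_name "mmusculus_gene_ensembl__xref_affy_mouse430_2__dm"
set_primary_key "gene_id_key"
end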
Next, I write an ID translation program (affyprobe2moe4302.rb) that uses the script above.
#!/usr/bin/env ruby
require 'ensembl'
agilentid = 'A_52_P654624'
# Look up the Agilent probe, then fetch every Affymetrix probe set
# that shares its gene_id_key.
agilentprobe = Agilentprobe.find_by_dbprimary_id(agilentid)
moe4302 = Moe4302.find_all_by_gene_id_key(agilentprobe.gene_id_key)
moe4302.each do |entry|
  puts [agilentid,
        entry.gene_stable_id, entry.transcript_stable_id, entry.translation_stable_id,
        entry.dbprimary_id].join("\t")
end
Let's run it!
$ ruby affyprobe2moe4302.rb
A_52_P654624 ENSMUSG00000058140 ENSMUST00000081249 ENSMUSP00000080006 1429452_x_at
A_52_P654624 ENSMUSG00000058140 ENSMUST00000081249 ENSMUSP00000080006 1435353_a_at
A_52_P654624 ENSMUSG00000058140 ENSMUST00000071829 ENSMUSP00000071732 1429452_x_at
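The lookup above is a join on gene_id_key. Here is a minimal sketch of the same logic on in-memory rows, so it runs without a database connection. The Row struct and the map_agilent_to_affy helper are hypothetical names made up for illustration. The row values are copied from the output on this page, and I assume the three Affymetrix rows carry gene_id_key 209556 because they were found through that key.

```ruby
# Hypothetical in-memory stand-ins for the two Ensembl Mart tables.
Row = Struct.new(:dbprimary_id, :gene_id_key, :gene_stable_id,
                 :transcript_stable_id, :translation_stable_id)

# Values copied from the program output shown on this page.
AGILENT_ROWS = [
  Row.new('A_52_P654624', 209556, 'ENSMUSG00000058140',
          'ENSMUST00000081249', 'ENSMUSP00000080006')
]
AFFY_ROWS = [
  Row.new('1429452_x_at', 209556, 'ENSMUSG00000058140',
          'ENSMUST00000081249', 'ENSMUSP00000080006'),
  Row.new('1435353_a_at', 209556, 'ENSMUSG00000058140',
          'ENSMUST00000081249', 'ENSMUSP00000080006'),
  Row.new('1429452_x_at', 209556, 'ENSMUSG00000058140',
          'ENSMUST00000071829', 'ENSMUSP00000071732')
]

# Returns one array of output fields per matching Affymetrix row,
# or an empty array when the Agilent ID is unknown.
def map_agilent_to_affy(agilent_id, agilent_rows, affy_rows)
  probe = agilent_rows.find { |r| r.dbprimary_id == agilent_id }
  return [] if probe.nil?

  affy_rows.select { |r| r.gene_id_key == probe.gene_id_key }.map do |r|
    [agilent_id, r.gene_stable_id, r.transcript_stable_id,
     r.translation_stable_id, r.dbprimary_id]
  end
end

map_agilent_to_affy('A_52_P654624', AGILENT_ROWS, AFFY_ROWS).each do |fields|
  puts fields.join("\t")
end
```

The nil check matters for the real script too: find_by_dbprimary_id returns nil when no row matches, so calling gene_id_key on the result fails for an unknown probe ID.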
Modeling Ensembl Mart using ActiveRecord
First, I write a Ruby script (ensembl.rb) that models Ensembl Mart.
#!/usr/bin/env ruby
require 'rubygems'
require_gem 'activerecord'
ActiveRecord::Base.pluralize_table_names = false
ActiveRecord::Base.establish_connection(
:adapter => 'mysql',
:database => 'ensembl_mart_40',
:host => 'ensembldb.ensembl.org',
:username => 'anonymous',
:password => ''
)
class Agilentprobe < ActiveRecord::Base
set_table_name "mmusculus_gene_ensembl__xref_agilentprobe__dm"
set_primary_key "gene_id_key"
end
Second, I write a program that searches the Agilentprobe table in
Ensembl Mart using the modeling code above.
#!/usr/bin/env ruby
require 'ensembl'
agilentprobe = Agilentprobe.find_by_dbprimary_id('A_52_P654624')
puts agilentprobe.dbprimary_id
puts agilentprobe.gene_id_key
puts agilentprobe.gene_stable_id
puts agilentprobe.transcript_id_key
puts agilentprobe.transcript_stable_id
puts agilentprobe.translation_id
puts agilentprobe.translation_stable_id
Let's run it!
$ ruby run_ensembl.rb
A_52_P654624
209556
ENSMUSG00000058140
263491
ENSMUST00000081249
251942
ENSMUSP00000080006
One-liner for mapping Agilent Microarray ID to Affymetrix Genechip ID
echo "
SELECT
mmusculus_gene_ensembl__xref_agilentprobe__dm.dbprimary_id,
mmusculus_gene_ensembl__xref_agilentprobe__dm.gene_stable_id,
mmusculus_gene_ensembl__xref_affy_mouse430_2__dm.dbprimary_id
FROM
mmusculus_gene_ensembl__xref_agilentprobe__dm,
mmusculus_gene_ensembl__xref_affy_mouse430_2__dm
WHERE
mmusculus_gene_ensembl__xref_agilentprobe__dm.dbprimary_id = 'A_52_P654624' AND
mmusculus_gene_ensembl__xref_agilentprobe__dm.gene_id_key =
mmusculus_gene_ensembl__xref_affy_mouse430_2__dm.gene_id_key;" \
| mysql -h ensembldb.ensembl.org -A -uanonymous -D ensembl_mart_40
You need the mysql command-line client.
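To run the same query for other probe IDs, the SQL text can be generated instead of typed by hand. This is only a sketch: mapping_sql is a hypothetical helper name, the table aliases a and m are my own, and the ID check assumes probe IDs contain only letters, digits and underscores, as the example ID does.

```ruby
AGILENT_TABLE = 'mmusculus_gene_ensembl__xref_agilentprobe__dm'
AFFY_TABLE    = 'mmusculus_gene_ensembl__xref_affy_mouse430_2__dm'

# Builds the mapping query for one Agilent probe ID.
# Rejects IDs with unexpected characters so that nothing odd
# ends up inside the quoted SQL literal.
def mapping_sql(agilent_id)
  unless agilent_id.to_s =~ /\A[A-Za-z0-9_]+\z/
    raise ArgumentError, "unexpected characters in probe ID: #{agilent_id.inspect}"
  end

  <<~SQL
    SELECT a.dbprimary_id, a.gene_stable_id, m.dbprimary_id
    FROM #{AGILENT_TABLE} a
    JOIN #{AFFY_TABLE} m ON a.gene_id_key = m.gene_id_key
    WHERE a.dbprimary_id = '#{agilent_id}';
  SQL
end

puts mapping_sql('A_52_P654624')
```

Saved under a name of your choice (say mapping_sql.rb, a made-up file name), its output can be piped into the same mysql command as above.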
Sunday, June 03, 2007
One-liner for mapping Affymetrix Probe ID to NCBI Entrez Gene ID :-)
You can map Affymetrix IDs to NCBI Entrez Genes with a one-liner command.
echo 'library(mouse4302)\nwrite.table(unlist(as.list(mouse4302GENENAME)),
file="affyid2genename.txt", sep="\\t", eol="\\n",
quote=F, col.names=F)' | R --no-save
You need R, Bioconductor and the mouse4302 package.
The result is written to the file "affyid2genename.txt":
less affyid2genename.txt
1459626_at NA
1457754_at RIKEN cDNA 4930430F08 gene
1452692_a_at NADH dehydrogenase (ubiquinone) flavoprotein 2
:
:
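Once the file exists, it is easy to load from Ruby. A minimal sketch, assuming each line is a probe ID, a tab, and a gene name (as set by sep in the write.table call), and treating the literal NA as "no gene name". parse_gene_names is a hypothetical helper name.

```ruby
# Parses "probe_id<TAB>gene name" lines into a Hash.
# Probes whose gene name is NA are kept with a nil value.
def parse_gene_names(text)
  text.each_line.with_object({}) do |line, map|
    probe_id, name = line.chomp.split("\t", 2)
    next if probe_id.nil? || probe_id.empty?

    map[probe_id] = (name.nil? || name == 'NA') ? nil : name
  end
end

# Sample lines taken from the output above, with the tab restored.
sample = "1459626_at\tNA\n" \
         "1457754_at\tRIKEN cDNA 4930430F08 gene\n" \
         "1452692_a_at\tNADH dehydrogenase (ubiquinone) flavoprotein 2\n"

names = parse_gene_names(sample)
puts names['1457754_at']
```

With the real file, pass File.read('affyid2genename.txt') instead of the sample string.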
Making a gene expression database within 11 min using Ruby on Rails
Ruby on Rails is a very powerful web application framework. This site shows that a bioinformatics scientist can build a simple viewer for gene expression data within 11 minutes using Ruby on Rails.
See the following movie:
http://itoshi.tv/data/geneexp/RoR.html
Saturday, June 02, 2007
We need big memory for exon array analysis
In addition, the oligo package will require you to make a PDEnv
with the HuEx-l_OST-V2.cdf file, which I am unable to do on
a 64 bit Linux box with 8 Gb RAM (I think 16 Gb is sufficient).
via Re: Human exon array chip
Exonmap (a Bioconductor package for exon array analysis)
also requires a lot of memory.
via Affy EXON Array: iterPLIER or RMA
It is based on the affy exon probeset mappings. It can run on
a machine with 4Gb of memory, but can easily use 8-12Gb depending
on what your trying to do. There are other probeset environments
available for this platform at Bioconductor (refseq, ensembl, etc.)
which are much more memory friendly.
via Re: Affymetrix exon arrays?
via Re: Bioconductor package for Affymetrix Exon arrays.
The oligo/makePlatformDesign packages can be used for the
exon arrays. However, you should know that these arrays are
incredibly huge, so unless you have a 64 bit OS and around
16 - 32 Gb RAM you won't be able to do anything with them.
Documents you must read before analyzing exon arrays
Quality assessment
Gene-level expression
Quality assessment
- White Paper, Quality Assessment of Exon Arrays (pdf, 85 KB)
- White Paper, Quality Assessment of Exon and Gene Array
Gene-level expression
- Gene Level Summarization on Exon Arrays (pdf, 747 KB)
- Gene Signal Estimates from Exon Arrays v1.0 (pdf, 428 KB)
Useful websites for Exon Array analysis
GeneChip All exon array official information
Mouse exon array official information
Software
Summarize, Background model, probe selection
Alternative splicing detection
Analysis
Viewer
Mouse exon array official information
- Affymetrix - Technical Support Documentation for GeneChip® Mouse Exon 1.0 ST Array
- Affymetrix - GeneChip® Mouse Exon 1.0 ST Array - Design Statistics Summary
Software
- exonmap (R + Bioconductor; needs a lot of memory, 4 - 8 GB)
- Affymetrix - Affymetrix Power Tools (Official software, command-line, low memory use. I recommend it.)
- dChip: Exon Array Analysis (Calculate the dChip value in Exon array)
- GeneChip-compatible™ Software Solutions Catalog (Commercial software)
Summarize, Background model, probe selection
- Exon array assessment of gene expression. (probe-specific background correction and Probe selection strategy)
- Probe Selection and Expression Index Computation of Affymetrix Exon Arrays
Alternative splicing detection
- A statistical framework for genome-wide discovery of biomarker splice variations with GeneChip Human Exon 1.0 ST Arrays. (Download paper)
Analysis
Viewer
- X:Map (using Google MAP API. Very Cool!)
- An annotation infrastructure for the analysis and interpretation of Affymetrix exon array data.
- ArrayFusion 1.5.5 - a web application for multi-dimensional analysis of CGH, SNP and microarray data (Web application)
- Galaxy
- X:Map Downloads (You can download CDF file for exon array)
- MAASE: An alternative splicing database designed for supporting splicing microarray applications -- ZHENG et al. 11 (12): 1767 -- RNA