mergeprepare              package:happy              R Documentation

_P_e_r_f_o_r_m _t_e_s_t_s _t_o _d_e_t_e_r_m_i_n_e _w_h_e_t_h_e_r _i_n_d_i_v_i_d_u_a_l
_p_o_l_y_m_o_r_p_h_i_s_m_s _c_o_u_l_d _h_a_v_e _g_i_v_e_n _r_i_s_e _t_o _a _Q_T_L

_D_e_s_c_r_i_p_t_i_o_n:

     mergeprepare() reads in datafiles describing the locations and
     strain distribution patterns of polymorphisms (SNPs or otherwise)
     which have not necessarily been genotyped. The following tasks are
     performed:

        1.  the polymorphism data are read in from testmarkerfile. For
           each polymorphism the corresponding skeleton marker interval
           is determined, based on their coordinates. Only those
           polymorphisms lying inside a skeleton marker interval are
           retained.

        2.  the coordinates (typically in bp rather than cM) of the
           genotyped markers are read in from markerposfile. Note that
           these coordinates are distinct from those in the cM map in
           h$map used in happy(). Only those markers listed in
           markerposfile that are also in h$markers are retained - the
           rest are discarded. The retained markers are referred to as
           'skeleton' markers as they define a framework of genotype
           data that can be used to test the significance of other
           polymorphisms.
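
     The interval assignment described above can be sketched in base R
     with findInterval(). This is an illustration only, not the
     package's implementation: the positions and marker names are
     invented, and the handling of a polymorphism lying exactly on a
     skeleton marker is an assumption.

```r
# Hypothetical skeleton marker positions (bp), sorted in increasing order.
skeleton_pos <- c(m1 = 1000, m2 = 5000, m3 = 9000)

# Hypothetical polymorphism positions (bp) to be tested.
test_pos <- c(snpA = 500, snpB = 1200, snpC = 5000, snpD = 8999, snpE = 12000)

# findInterval() returns i when skeleton_pos[i] <= x < skeleton_pos[i + 1],
# 0 when x lies left of the first marker, and length(skeleton_pos) when x
# lies at or beyond the last marker.
idx <- findInterval(test_pos, skeleton_pos)

# Keep only polymorphisms lying inside a skeleton marker interval.
inside <- idx >= 1 & idx < length(skeleton_pos)

retained <- data.frame(marker   = names(test_pos)[inside],
                       POSITION = test_pos[inside],
                       interval = idx[inside])
print(retained)
```

     Here snpA and snpE fall outside the skeleton map and are dropped;
     the remaining polymorphisms carry the index of their interval.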

     mergefit() tests each of the polymorphisms to see if it could be a
     QTL. It performs the following operations on each polymorphism:

        1.  The founder strains are merged together based on the strain
           distribution pattern for that polymorphism.

        2.  The merged data are used to fit a QTL in the corresponding
           skeleton marker interval.

        3.  The unmerged data are used to fit a QTL in the
           corresponding skeleton marker interval.

        4.  The fits of the merged and unmerged data are compared with
           a partial F-test. If the unmerged data are significant but
           the merged data are not then there is evidence to reject the
           polymorphism as being associated with the trait.
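
     The merge-and-compare procedure can be illustrated with lm() and
     anova() in base R. This is a minimal sketch under stated
     assumptions, not the code used by mergefit(): the founder design
     matrix, strain names, strain distribution pattern and phenotype
     are all simulated here, whereas the package works from the
     founder information stored in the happy object.

```r
set.seed(1)
n <- 200
strains <- c("A", "B", "C", "D")          # hypothetical founder strains

# Hypothetical stand-in for the founder design matrix in one interval:
# each row holds the dosage of each founder strain (rows sum to 2).
founder <- t(rmultinom(n, size = 2, prob = rep(0.25, 4)))
colnames(founder) <- strains

# Strain distribution pattern of the polymorphism: strains sharing an
# allele are merged into one group.
sdp <- c(A = 0, B = 0, C = 1, D = 1)
alleles <- sort(unique(sdp))

# Merge founder columns by summing dosages within each allele group.
merge_matrix <- sapply(alleles, function(a) as.numeric(sdp == a))
merged <- founder %*% merge_matrix
colnames(merged) <- paste0("allele", alleles)

# Simulated phenotype driven by the polymorphism itself.
y <- 0.8 * merged[, "allele1"] + rnorm(n)

# Drop one column from each design because the rows sum to a constant.
fit_null     <- lm(y ~ 1)
fit_merged   <- lm(y ~ merged[, -1, drop = FALSE])
fit_unmerged <- lm(y ~ founder[, -1, drop = FALSE])

# Significance of the merged fit on its own.
print(anova(fit_null, fit_merged))

# Partial F-test: does the unmerged model explain significantly more
# variance than the merged model nested inside it?
cmp <- anova(fit_merged, fit_unmerged)
print(cmp)
```

     The merged model is nested in the unmerged one because each merged
     column is a sum of founder columns, which is what makes the
     partial F-test applicable.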

     fastmergefit() is a convenience function which performs a complete
     analysis without making a prior call to happy().

     condmergefit() performs a conditional analysis in which each
     variant is fitted conditional upon every other variant being
     included in turn. This is VERY SLOW.

_U_s_a_g_e:

     mergeprepare( h, markerposfile, testmarkerfile, verbose=FALSE )
     mergefit( h, mergedata,  model='additive', covariatematrix=NULL,
     verbose=FALSE )
     fastmergefit( datafile, allelesfile, markerposfile,
     testmarkerfile, generations=200, model='additive', verbose=FALSE )
     condmergefit( h, mergedata, model='additive', covariatematrix=NULL,
     verbose=FALSE )

_A_r_g_u_m_e_n_t_s:

       h: an object returned by a previous call to happy()

markerposfile: the name of a text file containing the names and
          locations of the genotyped markers. Contains two named
          columns, 'marker' and 'POSITION'.

testmarkerfile: the name of a text file containing the names, positions
          and strain/allele distribution patterns for each polymorphism
          to be tested. Contains two columns 'marker' and 'POSITION'
          plus an additional named column for each of the strains
          listed in h$strains - _the column names and strain names must
          match exactly_. 

 verbose: switch to control the level of output sent to the screen

mergedata: an object created by a previous call to mergeprepare() 

   model: determine the type of model to be fitted - either 'additive'
          or 'full'.

          For the additive model it is assumed that the contribution to
          the phenotype from each chromosome is additive, ie if the
          founder strains at the locus being tested are s,t then the
          expected phenotype will be of the form T(s)+T(t).

          For the full model the expected phenotype will be of the form
          T(s,t).

          Analysis of variance is used to test for differences between
          the estimated effects T(s), T(s,t).

          The additive model is a submodel of the full, so for
          model='full' in addition a partial F-test is performed to
          test if the full model explains more variance than the
          additive. 
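
          The relationship between the two models can be sketched in
          base R. This is an illustration under simplifying
          assumptions, not the package's code: the founder pair (s,t)
          of each individual is taken as known, and the strain names
          and phenotype are simulated.

```r
set.seed(2)
n <- 300
strains <- c("A", "B", "C")               # hypothetical founder strains

# Hypothetical known founder pair (s, t) for each individual at the locus.
s1 <- sample(strains, n, replace = TRUE)
s2 <- sample(strains, n, replace = TRUE)

# Additive design: dosage of each strain, so the fit has the form
# T(s) + T(t).
dosage <- sapply(strains, function(k) (s1 == k) + (s2 == k))

# Full design: one level per unordered pair, so the fit has the form
# T(s, t).
pair <- factor(ifelse(s1 < s2, paste0(s1, s2), paste0(s2, s1)))

# Simulated phenotype with a purely additive effect of strain B.
y <- dosage[, "B"] + rnorm(n)

# Drop one dosage column because the dosages sum to 2 in every row.
fit_additive <- lm(y ~ dosage[, -1])
fit_full     <- lm(y ~ pair)

# Partial F-test of the full model against the nested additive model.
cmp <- anova(fit_additive, fit_full)
print(cmp)
```

          With three strains there are six unordered pairs, so the
          full model uses three more parameters than the additive one.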

covariatematrix: an optional design matrix which can be used to include
          additional terms in the model, such as other markers (using
          the matrix returned by hdesign()) and/or other covariates
          such as sex, age etc 

datafile: the name of a genotype datafile to be passed to happy()

allelesfile: the name of the corresponding alleles datafile to be
          passed to happy() 

generations: the number of generations to be passed to happy() 

_V_a_l_u_e:

     mergeprepare() returns a list with the following named elements:

markerpos: the positions of the markers

interval: an array. interval[p] contains the index of the genotyped
          marker interval in which the polymorphism p is located, or
          NULL if it is outside all genotyped intervals.  

 markers: 

testmarkerdata: details about the polymorphisms to be tested


     mergefit() and fastmergefit() return an object, called say 'fit',
     suitable for plotting using mergeplot(). It contains a named
     element 'table' containing the log-P values as in hfit(), which
     can be printed using 'write.table(fit$table)'.

     condmergefit() returns a table with columns  "position",
     "interval", "sdp", "logPself", "logPmax", "logPmaxPosition" .

_A_u_t_h_o_r(_s):

     Richard Mott

_S_e_e _A_l_s_o:

     happy(), mergeplot()

_E_x_a_m_p_l_e_s:

     ## An example session:
     # initialise happy
     ## Not run: h <- happy('Hs.data','HS.alleles')
     # prepare the merge files
     ## Not run: prep <- mergeprepare( h, 'markers.positions','testmarkers.txt' )
     # run the merge fit
     ## Not run: fit <- mergefit( h, prep )
     # alternative, and equivalent, use of fastmergefit():
     ## Not run: 
     fit <- fastmergefit( 'Hs.data','HS.alleles',
     'markers.positions','testmarkers.txt' )
     ## End(Not run)
     # plot the results
     ## Not run: mergeplot( fit, prep )

