hfit                  package:happy                  R Documentation

_F_i_t _a _m_o_d_e_l _t_o _a_n _o_b_j_e_c_t _r_e_t_u_r_n_e_d _b_y _h_a_p_p_y()

_D_e_s_c_r_i_p_t_i_o_n:

     hfit() fits a QTL model to a happy() object for a specified set of
     markers. The model can be additive or full (i.e. allowing for
     dominance effects). The test is a partial F-test. In the case of
     the full model two tests are performed: the full against the null,
     and the full against the additive.
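
     The two partial F-tests can be illustrated with base R alone. The
     sketch below is not the code used by hfit(): it assumes a
     hypothetical biallelic locus coded as an allele count (dosage)
     rather than HAPPY's founder-strain probabilities, and the data
     are simulated.

```r
set.seed(1)
n <- 200
dosage <- sample(0:2, n, replace = TRUE)             # hypothetical allele counts
y <- 0.5 * dosage + 0.8 * (dosage == 1) + rnorm(n)   # additive plus dominance effect

fit.null     <- lm(y ~ 1)                # no QTL
fit.additive <- lm(y ~ dosage)           # one slope per allele copy
fit.full     <- lm(y ~ factor(dosage))   # one mean per genotype class

# Partial F-tests between nested models, reported as -log10(P)
p.full.vs.null     <- anova(fit.null, fit.full)[2, "Pr(>F)"]
p.full.vs.additive <- anova(fit.additive, fit.full)[2, "Pr(>F)"]
logP <- -log10(c(full.vs.null = p.full.vs.null,
                 full.vs.additive = p.full.vs.additive))
print(logP)
```

     A large logP for the full-versus-additive comparison suggests
     that the additive model misses a dominance effect.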

     hfit.sequential() performs an automated search for multiple QTL,
     fitting marker intervals in a sequential manner, and testing for a
     QTL conditional upon the presence of previously identified QTL.
     This is essentially forward selection of variables in multiple
     regression, and very similar to composite interval mapping.
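
     Forward selection with a logP entry threshold can be sketched as
     follows. This is an illustration of the general idea only, not
     the implementation of hfit.sequential(): the predictors m1 to m6
     are hypothetical stand-ins for marker intervals, and the data are
     simulated.

```r
set.seed(2)
n <- 300
X <- matrix(rnorm(n * 6), n, 6,
            dimnames = list(NULL, paste0("m", 1:6)))   # hypothetical predictors
y <- 1.0 * X[, "m2"] + 0.7 * X[, "m5"] + rnorm(n)
dat <- data.frame(y = y, X)

threshold <- 2            # logP threshold, i.e. P < 10^-2
selected <- character(0)
repeat {
  candidates <- setdiff(colnames(X), selected)
  if (length(candidates) == 0) break
  # Model containing the terms accepted so far
  base <- lm(reformulate(c("1", selected), response = "y"), data = dat)
  # Partial F-test of each remaining candidate, conditional on 'selected'
  logP <- sapply(candidates, function(m) {
    fit <- lm(reformulate(c(selected, m), response = "y"), data = dat)
    -log10(anova(base, fit)[2, "Pr(>F)"])
  })
  if (max(logP) < threshold) break
  selected <- c(selected, names(which.max(logP)))
}
print(selected)
```

     Each pass adds the best remaining candidate and stops when none
     reaches the threshold.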

     pfit() is a convenience function to fit several univariate
     phenotypes to the same genotype data.

     normalise() is a convenience function to convert a vector of
     phenotype data into a set of standard Gaussian deviates: the
     values are first ranked and then the ranks are replaced by the
     corresponding quantiles of a standard Normal distribution. This
     may help when mapping traits that are strongly non-normal
     (alternatively, use the permute argument of hfit()).
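
     The rank-to-Normal transformation described above can be sketched
     in base R. The function rank.to.normal() below is a hypothetical
     helper written for illustration, not the source of normalise();
     in particular the 0.5 offset and the handling of missing values
     are assumptions.

```r
# Map values to standard Normal quantiles via their ranks.
rank.to.normal <- function(values) {
  ok  <- !is.na(values)
  out <- rep(NA_real_, length(values))
  r   <- rank(values[ok])                  # ties receive average ranks
  out[ok] <- qnorm((r - 0.5) / sum(ok))    # assumed offset keeps quantiles finite
  out
}

set.seed(3)
x <- c(rexp(50), NA)      # strongly skewed trait with one missing value
z <- rank.to.normal(x)
summary(z)
```

     The transformed values keep the ordering of the original ones but
     are symmetric about zero.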

_U_s_a_g_e:

     hfit( h, markers=NULL, model='additive', mergematrix=NULL,
     covariatematrix=NULL, verbose=FALSE, phenotype=NULL, family='gaussian', permute=0 )

     hfit.sequential( h, threshold=2,  markers=NULL, model='additive',
     mergematrix=NULL, covariatematrix=NULL, verbose=FALSE, family='gaussian')

     pfit( h, phen, markers=NULL, model='additive', mergematrix=NULL,
     covariatematrix=NULL, verbose=FALSE, family='gaussian' )

     normalise(values)

_A_r_g_u_m_e_n_t_s:

       h: an object returned by a previous call to happy()

 markers: a vector of marker intervals to test. The markers can either
          be specified by name or by index. Default is NULL, in which
          case all the markers are fitted (same as setting
          markers=h$markers) 

   model: specify the type of model to be fit. Either 'additive', where
          the contributions of each allele at the locus are assumed to
          act additively, or 'full', in which a term for every possible
          combination of alleles is included. The default 'additive'
          mimics the behaviour of the original C HAPPY software.

mergematrix: specify a mergematrix object (returned by mergematrices())
          which describes which founder strains are to be merged.
          This is used to test whether merging strains reduces
          statistical significance (see mergematrices()).

covariatematrix: Optional additional matrix of covariates to include in
          the model. These may be additional markers (returned by
          hdesign()) or covariates such as sex, age etc.

 verbose: control whether to print the results of the fits to the
          screen, or work silently (the default)

threshold: the logP threshold (logP is the negative base-10 logarithm
          of the P-value) used in hfit.sequential() to decide whether
          to include a marker interval in the QTL model. The default is
          2, i.e. a marker interval must have a partial F statistic
          with P-value < 0.01 (logP > 2) to be included.

  family: the distribution of errors in the data. The default is
          'gaussian'. This variable controls the type of model fitting.
          In the gaussian case a standard linear model is fitted using
          lm(). Otherwise the data are fitted as a generalised linear
          model using glm(), when the value of family must be one of
          the distributions handled by glm(), such as 'binomial' or
          'Gamma'. See family() for the full range of models.

 permute: The number of permutations to perform. Default is 0, i.e. no
          permutation testing is done, and statistical significance is
          assessed by ANOVA (or Analysis of Deviance if family !=
          'gaussian'). If permute>0 then statistical significance is
          assessed by permuting the phenotypes between individuals,
          repeating the model fit, and recording the top-scoring
          marker interval. The empirical distribution of the maximum
          logP values is then used to assess statistical significance.
          This technique is useful for non-normally distributed
          phenotypes and for estimating region-wide significance
          levels. Note that permutation testing is very slow.

    phen: a data.frame containing additional phenotypes. Each column
          should be numeric (only used by pfit())

phenotype: An optional vector containing the phenotype values. Used to
          override the default phenotype in h$phenotype (hfit() only).

  values: a numeric vector of phenotype values to transform into normal
          deviates (normalise() only) 
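
     The switch between lm() and glm() described for the family
     argument can be sketched in base R. The helper fit.logP() is
     hypothetical and only mirrors the documented behaviour; it is not
     taken from the package, and the predictor x is simulated rather
     than derived from a happy() object.

```r
# Return -log10(P) for a single predictor, choosing lm() or glm() by family.
fit.logP <- function(y, x, family = "gaussian") {
  if (family == "gaussian") {
    p <- anova(lm(y ~ 1), lm(y ~ x))[2, "Pr(>F)"]          # ANOVA F-test
  } else {
    null <- glm(y ~ 1, family = family)
    fit  <- glm(y ~ x, family = family)
    p <- anova(null, fit, test = "Chisq")[2, "Pr(>Chi)"]   # analysis of deviance
  }
  -log10(p)
}

set.seed(4)
n <- 200
x <- rnorm(n)
y.cont <- 0.5 * x + rnorm(n)                 # continuous trait
y.bin  <- rbinom(n, 1, plogis(0.8 * x))      # binary trait

lp.gauss <- fit.logP(y.cont, x)
lp.bin   <- fit.logP(y.bin, x, family = "binomial")
print(c(gaussian = lp.gauss, binomial = lp.bin))
```

     Either value can be compared with a logP threshold such as the
     default of 2 used by hfit.sequential().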

_V_a_l_u_e:

     hfit() returns a list. The following components of the list are of
     interest: 

   table: a table with the log-P values of the F statistics. The table
          contains one row per marker interval. The columns are the
          negative base-10 logarithms of the P-values of the F-test of
          the null hypothesis that there is no QTL in the marker
          interval. In the case of model='full', the partial F-test
          that the full model is no better than the additive is also
          given.

          In the special case of model='additive' and verbose=TRUE the
          effects of all estimable strains are compared with a T-test,
          taking into account the correlations between these estimates.
          However, it should be noted that estimates of individual
          strain effects may be hard to interpret when some
          combinations of strains are indistinguishable, and it is
          possible for the overall F-statistic to be very significant
          whilst none of the strains appear to be significant, based on
          their T-statistics. The F-statistic is a better indicator of
          the true overall fit of the model.   

permdata: a list containing the results of the permutation analysis, or
          NULL if permute=0. The list contains the following elements:

     _N The number of permutations 

     _p_e_r_m_u_t_a_t_i_o_n._d_i_s_t A vector containing sorted ANOVA logP values from
          the N permutations. These values can be used to estimate the
          shape of the null distribution, and plotted e.g. using
          hist().

     _p_e_r_m_u_t_a_t_i_o_n._p_v_a_l A data table containing the permutation p-values
          for each marker interval. The columns in the datatable give
          the position in cM, the marker name (left-hand marker in the
          interval), the original ANOVA logp, the permutation pval for
          this logp, and the log permutation P-value. Bothe global (ie
          region-wide) and pointwise pvalues are given. The Global
          pvalue for a marker interval is the fraction of times that
          the logP for the interval (either additive or full, depending
          on the model specified) is exceeded by the maximum logP in
          all intervals for permuted data. The pointwise pvalue is the
          fraction of permutation logP at the marker interval that
          exceed the logP for that interval.
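
     The global and pointwise permutation P-values defined above can
     be sketched in base R. This is an illustration on simulated data,
     not the code used by hfit(): the matrix X holds one hypothetical
     predictor per marker interval, and scan.logP() is a helper
     invented for the example.

```r
set.seed(5)
n <- 100; n.int <- 8; n.perm <- 200
X <- matrix(rnorm(n * n.int), n, n.int)   # one hypothetical predictor per interval
y <- 0.6 * X[, 3] + rnorm(n)              # simulated QTL in interval 3

# ANOVA logP for every interval
scan.logP <- function(y, X) {
  apply(X, 2, function(x) -log10(anova(lm(y ~ 1), lm(y ~ x))[2, "Pr(>F)"]))
}

observed <- scan.logP(y, X)
# Rows are permutations, columns are intervals
perm <- t(replicate(n.perm, scan.logP(sample(y), X)))
perm.max <- apply(perm, 1, max)           # top-scoring interval per permutation

global.p    <- sapply(observed, function(o) mean(perm.max > o))
pointwise.p <- sapply(seq_len(n.int),
                      function(j) mean(perm[, j] > observed[j]))
permutation.dist <- sort(perm.max)        # empirical null of the maximum logP
print(round(cbind(observed, pointwise.p, global.p), 3))
```

     The global P-value is never smaller than the pointwise one,
     because the maximum over all intervals is at least as large as
     the value at any single interval.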


     The object returned by hfit() is suitable for plotting with
     happyplot().

     pfit() returns a list of hfit() objects, the n'th being the fit
     for the n'th column (phenotype) in phen.

_A_u_t_h_o_r(_s):

     Richard Mott

_S_e_e _A_l_s_o:

     happy()

_E_x_a_m_p_l_e_s:

     ## An example session:
     # initialise happy
     ## Not run: h <- happy('Hs.data','HS.alleles')
     # fit all the markers with an additive model
     ## Not run: f <- hfit(h)
     # plot the results
     ## Not run: happyplot(f)
     # fit a non-additive model
     ## Not run: ff <- hfit(h, model='full')
     # view the results
     ## Not run: write.table(ff$table, quote=FALSE)
     # plot the results
     ## Not run: happyplot(ff)
     # use normalised trait values
     ## Not run: ff <- hfit(h,phenotype=normalise(h$phenotypes))
     # permutation test with 1000 permutations 
     ## Not run: ff <- hfit(h, model='full', permute=1000)

