US5427744A - Sequencing of oligosaccharides - Google Patents

Info

Publication number
US5427744A
Authority
US
United States
Prior art keywords
oligosaccharide
structures
unknown
sequencing
enzyme
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
US08/140,143
Inventor
Rajesh B. Parekh
Sally B. Prime
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oxford GlycoSystems Ltd
Original Assignee
Oxford GlycoSystems Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oxford GlycoSystems Ltd
Assigned to Oxford GlycoSystems Ltd: assignment of assignors' interest (see document for details). Assignors: Parekh, Rajesh Bhikhu; Prime, Sally Barbara

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/34: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase
    • C12Q1/54: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving glucose or galactose
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N2333/00: Assays involving biological materials from specific organisms or of a specific nature
    • G01N2333/90: Enzymes; Proenzymes
    • G01N2333/914: Hydrolases (3)
    • G01N2333/924: Hydrolases (3) acting on glycosyl compounds (3.2)
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10T: TECHNICAL SUBJECTS COVERED BY FORMER US CLASSIFICATION
    • Y10T436/00: Chemistry: analytical and immunological testing
    • Y10T436/14: Heterocyclic carbon compound [i.e., O, S, N, Se, Te, as only ring hetero atom]
    • Y10T436/142222: Hetero-O [e.g., ascorbic acid, etc.]
    • Y10T436/143333: Saccharide [e.g., DNA, etc.]

Definitions

  • the present invention relates to the analysis of oligosaccharides and more particularly to the form of analysis known as sequencing of oligosaccharides.
  • Oligosaccharides form a class of chemical compounds which are each made up of a number of monosaccharide units linked together by glycosidic bonds.
  • Important sources of naturally occurring oligosaccharides are glycoproteins in which oligosaccharides are found linked to a peptide chain by either an N-glycosidic bond or by an O-glycosidic bond; these oligosaccharides may vary up to highly branched structures containing many (e.g. over 30) monosaccharide units.
  • a single reaction is performed with a particular reagent on a pure sample of the unknown oligosaccharide and its products analysed.
  • the reagent is chosen such that the product of the reaction will reveal the presence or absence of a particular structural sub-unit or sub-units which may or may not occur in the sample.
  • a further reaction may be carried out using a different reagent, either with more of the original pure sample or with the products of the previous reaction.
  • An agent which assists in this process of structural analysis may be regarded as a "sequencing agent".
  • Where a sequencing agent is a chemical or biochemical reagent, the sequencing agent may be regarded as a sequencing reagent.
  • Examples of sequencing reagents are enzymes and typical enzymes used in sequencing oligosaccharides are listed in the Enzyme Rules Table set out below.
  • a sequencing agent may be chosen such that the products obtained when it is brought into contact with an unknown oligosaccharide will reveal (in later analysis) the presence or absence of a particular structural sub-unit (e.g. a monosaccharide unit) in the oligosaccharide. It will also be appreciated that by sequentially contacting either samples of the original oligosaccharide or products thereof (produced by previously contacting samples of the oligosaccharide with one or more sequencing reagents) with a sequencing reagent, it is possible to deduce that the unknown oligosaccharide contains a certain structure or structures or does not contain a certain structure or structures.
  • the presence and linkage of a particular monosaccharide at the terminus of an oligosaccharide may be confirmed by the ability of a given sequencing agent, such as an enzyme, to cause cleavage of that linkage; thus, if cleavage occurs when that enzyme is brought together with the oligosaccharide, then detection of that particular monosaccharide (now detached from the oligosaccharide) in the products of the reaction will confirm the presence of that monosaccharide and will define the position and orientation of the linkage of that monosaccharide in the original oligosaccharide structure.
  • a sequencing agent or reagent can be an agent or reagent which either cleaves or forms chemical bonds when brought into contact with an unknown oligosaccharide.
  • the effectiveness of the sequencing method depends critically upon the choice of agent to be used at each stage of each reaction process and the accuracy of the interpretation of the results of each analysis.
  • a good choice of agent requires an experienced operator who has already made some intelligent guesses about the type of structure being dealt with.
  • a poor choice may result in little or no additional information being revealed by a particular experiment, resulting in the process being wasteful of both time and materials.
  • the operator may assign a single structure which is consistent with the experimental results, whereas in reality there may be more than one structure consistent with the same set of observations.
  • a further difficulty arises in defining the point at which no further information can be revealed by the use of the agents available.
  • the sequencing of oligosaccharides may be assisted by a knowledge of the biosynthetic pathways involved in building up the oligosaccharide structures.
  • monosaccharides may only add on in certain well defined orders and branching patterns.
  • This knowledge may be used to develop the concept of `basic` or `core` structures for oligosaccharides from which sets of sub-structures can be generated by performing specific transformations on the `basic` structures.
  • three `basic` structures are shown respectively in FIGS. 1 to 3.
  • the illustrated `basic` structures are not all the structural possibilities for N-glycans and are shown by way of example only.
  • FIGS. 5-8 show four groups of structural possibilities with a given set of monosaccharides.
  • the transformation used to generate the sub-structures is successive deletion of terminal monosaccharides.
  • Numerous sub-structures can thus be formed starting with a `basic` structure and, for example, successively deleting monosaccharides in all possible ways.
  • the number of sub-structures which can be generated by transformation from the three illustrated `basic` structures is, therefore, very large.
  • an initial analysis will allow a first judgement to be made that the oligosaccharide belongs to a particular set of sub-structures. This leads to the term "candidate structure" which is used herein to refer to a structure which is included in the set of sub-structures as being a possible candidate for having the same structure as the unknown oligosaccharide.
  • the oligosaccharide to be sequenced has been purified from a mixture of oligosaccharides released from a glycoprotein by the use of enzyme peptide-N-glycosidase F, then it can be assumed that the oligosaccharide is an N-glycan and that its structure is likely to be one of the sub-structures generated from `basic` structures similar to the ones illustrated in FIGS. 1 to 3.
  • the present invention provides a way of optimising the use of sequencing agents, interpreting the results unambiguously and determining the point where no further sequencing with the available reagents is possible.
  • apparatus for sequencing an unknown oligosaccharide comprising first means for deducing a set of oligosaccharide structures of which the unknown oligosaccharide is assumed to be a member, second means for simulating the effect of applying each one of a series of sequencing agents to each of said set of oligosaccharide structures and third means for determining the sequencing agent likely to give the most structural information if applied to the unknown oligosaccharide.
  • the invention also provides a method of sequencing an unknown oligosaccharide comprising deducing a set of oligosaccharide structures of which the unknown oligosaccharide is assumed to be a member, simulating the effect of applying each one of a series of sequencing agents to each of said set of oligosaccharide structures and determining the sequencing agent likely to give the most structural information if applied to the unknown oligosaccharide.
  • the invention includes performing an initial analysis whereby to identify at least one basic structure (as herein defined) from which the said set of oligosaccharide structures can be generated.
  • the invention includes contacting the unknown oligosaccharide with the agent or agents selected in accordance with a set of rules for the use of sequencing agents and comparing the actual experimental result with the calculation.
  • the method of the present invention postulates an initial set of candidate structures and allows a calculation to be made to determine the most efficient experiment to be performed at any given time. Structures inconsistent with the results of this experiment are then eliminated from the existing set and the next experiment determined based on the structures remaining. The process is repeated until no further elimination is possible with the given sequencing agents.
  • the initial composition analysis indicates that the unknown oligosaccharide is a member of a set of candidate structures denoted [A, B, C, D] and that enzymes 1, 2, 3, 4 are the available sequencing reagents.
  • the results of applying these enzymes to these structures can be predicted by applying enzyme rules such as those set out in the Enzyme Rules Table above, to each structure in turn.
  • the rules might indicate that enzyme 1 will cleave the monosaccharide mannose from each of the candidate structures as follows,
  • the question to be answered in this example is, which, if any, of the candidate set [A, B, C, D] is the actual structure of the unknown sample. It must be determined which of the available sequencing reagents (in this example enzymes 1 to 4) is the best one to use to help find the answer to that question most efficiently. This is done by considering a Breakdown Result (a) for each enzyme derived from data in the above tables. A Breakdown Result groups the candidate structures into subsets where all the members of that subset give the same result in terms of the number of monosaccharides cleaved when reacted with the enzyme.
  • a Breakdown Entropy (b) can be calculated to give a numerical measure of the efficiency of each enzyme.
  • the Breakdown Entropy (b) is the sum over all the subsets in a Breakdown Result (a) of the term,
  • N(i) is the length of the ith subset, i.e. the number of candidate structures in the ith subset.
  • the Breakdown Entropy of an enzyme is a direct measure of the scatter of the structures in the candidate set among different possible experimental outcomes from applying that enzyme. When all experimental outcomes are the same, the Breakdown Entropy has its minimum value (as in the above example for enzyme 3), and it is clear that it is impossible to distinguish between the candidate structures by performing that experiment. Conversely, if all experimental outcomes are distinct, the Breakdown Entropy is identically zero, which is its maximum possible value, and performing that experiment will identify uniquely the structure being sought. In general, the enzyme with the maximum Breakdown Entropy is chosen so as to give the highest statistical chance of rapidly eliminating structures from the set of candidates. In the example above therefore, enzyme 1 would be chosen.
  • the Breakdown Entropy is the primary index of evaluation in the search for the best sequencing experiment. An enzyme can be eliminated from further consideration in the search under the following criteria,
  • a first generation search, i.e. a search on the original sample
  • a second generation search, i.e. a search on the residue from the first generation search
  • third generation, fourth generation, etc., searches are performed, until the point is reached where the candidate structures are deemed indistinguishable because all enzyme mixtures being tested would reduce all candidate structures to a common residue.
  • the method can be carried out as a series of logical steps, not all of which need to be performed in every case:
  • a further set of calculations can be performed by re-defining the set of candidate structures as the subset of structures identified in the Breakdown Result.
  • the sample used for analysis was NA2F, which has the composition shown in FIG. 4.
  • FIG. 5: 9 variants
  • FIG. 6: 18 variants
  • FIG. 7: 12 variants
  • FIG. 8: 6 variants
  • FIGS. 7 and 8 are eliminated by the first experiment, because these would yield mannose if treated with the alpha-mannosidase, whereas the putative test sample yielded none.
  • the variants of FIG. 6 are eliminated by the second experiment, which yielded two galactose, whereas the presence of the outer arm fucose branch in FIG. 6 blocks the action of the galactosidase on that branch.
  • of the nine remaining structures of FIG. 5, only the structure given would yield two N-Acetyl Glucosamine entities when reacted with the enzymes in the sequence given.
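
The grouping and scoring described in the definitions above can be sketched in Python. This is an illustration under stated assumptions, not the patented implementation. The per-subset term is not reproduced in this text, so the sketch assumes the form -N(i)·log N(i), chosen only because it matches the stated properties (zero, and maximal, when every outcome is distinct; minimal when all outcomes coincide). The predicted cleavage counts for candidates A to D and enzymes 1 to 4 are hypothetical, as are the function names.

```python
import math
from collections import defaultdict

def breakdown_result(outcomes):
    """Group candidate structures by predicted outcome (number of monosaccharides cleaved)."""
    groups = defaultdict(list)
    for candidate, n_cleaved in outcomes.items():
        groups[n_cleaved].append(candidate)
    return list(groups.values())

def breakdown_entropy(result):
    """Sum of -N(i)*log(N(i)) over the subsets (assumed form of the term)."""
    return sum(-len(subset) * math.log(len(subset)) for subset in result)

def eliminate(outcomes, observed):
    """Keep only the candidates whose predicted outcome matches the experiment."""
    return [c for c, n in outcomes.items() if n == observed]

# Hypothetical predictions: monosaccharides cleaved from each candidate by each enzyme.
predicted = {
    "enzyme 1": {"A": 0, "B": 1, "C": 2, "D": 3},  # every outcome distinct
    "enzyme 2": {"A": 1, "B": 1, "C": 0, "D": 0},
    "enzyme 3": {"A": 0, "B": 0, "C": 0, "D": 0},  # every outcome identical
    "enzyme 4": {"A": 2, "B": 0, "C": 0, "D": 0},
}

# The enzyme with the maximum Breakdown Entropy is the preferred next experiment.
best = max(predicted, key=lambda e: breakdown_entropy(breakdown_result(predicted[e])))
```

With these illustrative numbers the selection falls on the enzyme whose outcomes are all distinct, and `eliminate` then narrows the candidate set once the experiment has been run.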

Abstract

PCT No. PCT/GB92/00831 Sec. 371 Date Nov. 3, 1993 Sec. 102(e) Date Nov. 3, 1993 PCT Filed May 7, 1992 PCT Pub. No. WO92/19766 PCT Pub. Date Nov. 12, 1992. The present invention relates to the analysis of unknown oligosaccharides and more particularly the form of analysis in which an unknown oligosaccharide is sequentially brought together with different agents which cause monosaccharides to be cleaved from (or chemically bound to) the oligosaccharide. If cleavage occurs the monosaccharide will subsequently be detected in the products of the experiment, thus confirming that the monosaccharide was attached to the unknown oligosaccharide and providing a means of determining the structure of the unknown oligosaccharide. The invention proposes an analytical technique for determining the best agent to be used in performing such an experiment so that a maximum amount of information is obtained from the experiment.

Description

This is a continuation of PCT/GB92/00831 filed on May 7, 1992 which is now WO 92/19766.
The present invention relates to the analysis of oligosaccharides and more particularly to the form of analysis known as sequencing of oligosaccharides.
Oligosaccharides form a class of chemical compounds which are each made up of a number of monosaccharide units linked together by glycosidic bonds. Important sources of naturally occurring oligosaccharides are glycoproteins in which oligosaccharides are found linked to a peptide chain by either an N-glycosidic bond or by an O-glycosidic bond; these oligosaccharides may vary up to highly branched structures containing many (e.g. over 30) monosaccharide units.
In a sequential analysis of an unknown oligosaccharide structure, a single reaction is performed with a particular reagent on a pure sample of the unknown oligosaccharide and its products are analysed. The reagent is chosen such that the product of the reaction will reveal the presence or absence of a particular structural sub-unit or sub-units which may or may not occur in the sample. Based on the results of this analysis, a further reaction may be carried out using a different reagent, either with more of the original pure sample or with the products of the previous reaction. From the analysis of the second reaction further conclusions may be drawn about the structure of the original oligosaccharide sample, and the process is repeated as often as necessary, until as much information as possible has been extracted about the original oligosaccharide structure with the reagents available or until supplies of the pure sample and derived products have been exhausted. The end point of any analysis of an oligosaccharide structure, whether the analytical technique is sequential or not, is to deduce information concerning the oligosaccharide structure; this information includes:
(i) the type of each monosaccharide unit in the oligosaccharide,
(ii) the order in which the monosaccharide units are arranged in the oligosaccharide,
(iii) the position of linkages between each of the monosaccharide units (e.g. 1-3, 1-4), and hence any branching pattern,
(iv) the orientation of the linkage between each of the monosaccharide units (i.e. whether a linkage is an alpha linkage or a beta linkage).
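By way of illustration only, the four kinds of information listed above can be held in a small tree-shaped record. The following Python sketch is an assumption made for this example and is not part of the original disclosure; the class name `Residue`, its field names and the example fragment are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Residue:
    """One monosaccharide unit plus the linkage joining it to its parent unit."""
    sugar: str                       # (i) type of monosaccharide, e.g. "Man", "Gal"
    anomer: Optional[str] = None     # (iv) "alpha" or "beta"; None for the root unit
    position: Optional[int] = None   # (iii) atom of the parent unit that is linked to
    children: List["Residue"] = field(default_factory=list)  # (ii) order and branching

    def is_terminal(self) -> bool:
        """A terminal unit has nothing attached beyond it."""
        return not self.children

    def count(self) -> int:
        """Total number of monosaccharide units in this (sub)structure."""
        return 1 + sum(child.count() for child in self.children)

# Hypothetical fragment: galactose linked 1-beta-4 to an N-acetyl hexosamine,
# itself linked 1-beta-2 to a mannose.
fragment = Residue("Man", children=[
    Residue("GlcNAc", "beta", 2, children=[Residue("Gal", "beta", 4)]),
])
```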
An agent which assists in this process of structural analysis may be regarded as a "sequencing agent". Where a sequencing agent is a chemical or biochemical reagent the sequencing agent may be regarded as a sequencing reagent. Examples of sequencing reagents are enzymes and typical enzymes used in sequencing oligosaccharides are listed in the Enzyme Rules Table set out below.
__________________________________________________________________________
Enzyme Rules Table
The following is a list of the enzymes commonly used for cleaving monosaccharides from N-linked oligosaccharides and the rules showing which monosaccharides are cleaved by each of these enzymes and from which part of the oligosaccharide structure cleavage can be expected.

                                             Monosaccharide
     Enzyme                                  Cleaved              Rules for Cleavage
__________________________________________________________________________
 1)  Achatina fulica beta mannosidase        Mannose              1-beta-4 to any site
 2)  A.saitoi alpha mannosidase              Mannose              1-alpha-2 to any site
 3)  Jack bean alpha mannosidase             Mannose              1. mannose 1-alpha-2 to any site
                                                                  2. cleaves the 1-alpha-3,6 mannoses if there
                                                                     is no bisect on the middle mannose
 4)  Jack bean alpha mannosidase             Mannose              Different from 3) in that it will not cleave
     (under arm-specific conditions)                              the mannose 1-alpha-6 case when restricted
                                                                  by a side arm longer than 1 unit out from
                                                                  the middle mannose
 5)  Bovine testis beta galactosidase        Galactose            1-beta-3,4 to any non-branched site
 6)  Jack bean beta galactosidase            Galactose            1-beta-4 to any non-branched site
 7)  C.lampas beta xylosidase                Xylose               1-beta-any to any site
 8)  S.pneum beta N-acetyl hexosaminidase    N-acetyl hexosamine  1. 1-beta-3 to any non-branched site
                                                                  2. 1-beta-2 mannose 1-alpha-3 or 6 to middle
                                                                     mannose provided the first mannose does
                                                                     not have a bond at atom 6, and also in
                                                                     the case of 1-alpha-6 that the middle
                                                                     mannose is not bisected
 9)  Jack bean beta N-acetyl hexosaminidase  N-acetyl hexosamine  1-beta-any to any site
10)  Bovine epididymis alpha fucosidase      Fucose               1-alpha-3,4,6 to any site
11)  C.lampas alpha fucosidase               Fucose               1-alpha-2,3,4,6 to any site
12)  Coffee bean alpha galactosidase         Galactose            1-alpha-3,6 to any non-branched site
13)  Almond alpha fucosidase                 Fucose               1-alpha-3,4 to any site
__________________________________________________________________________
It will be appreciated that a sequencing agent may be chosen such that the products obtained when it is brought into contact with an unknown oligosaccharide will reveal (in later analysis) the presence or absence of a particular structural sub-unit (e.g. a monosaccharide unit) in the oligosaccharide. It will also be appreciated that by sequentially contacting either samples of the original oligosaccharide or products thereof (produced by previously contacting samples of the oligosaccharide with one or more sequencing reagents) with a sequencing reagent, it is possible to deduce that the unknown oligosaccharide contains a certain structure or structures or does not contain a certain structure or structures.
For example, the presence and linkage of a particular monosaccharide at the terminus of an oligosaccharide may be confirmed by the ability of a given sequencing agent, such as an enzyme, to cause cleavage of that linkage; thus, if cleavage occurs when that enzyme is brought together with the oligosaccharide, then detection of that particular monosaccharide (now detached from the oligosaccharide) in the products of the reaction will confirm the presence of that monosaccharide and will define the position and orientation of the linkage of that monosaccharide in the original oligosaccharide structure. Thus, by sequentially using a plurality of different sequencing agents having specific linkage cleaving capabilities, it is possible to deduce increasing amounts of information regarding the structure of the oligosaccharide under analysis.
Where no single sequencing agent can be found that can increase the amount of information about the structure of the unknown oligosaccharide then consideration can be given to possible combinations of two, three or more agents being applied one after the other; at each stage of consideration an agent which does not produce a meaningful result may be discarded, although the fact that it did not react with the oligosaccharide to produce products may allow deductions to be made regarding the structure of the oligosaccharide. Consideration may also be given to using a combination of two or more agents applied simultaneously.
For the purposes of this specification, a sequencing agent or reagent can be an agent or reagent which either cleaves or forms chemical bonds when brought into contact with an unknown oligosaccharide.
The effectiveness of the sequencing method depends critically upon the choice of agent to be used at each stage of each reaction process and the accuracy of the interpretation of the results of each analysis. At present, to a large degree, a good choice of agent requires an experienced operator who has already made some intelligent guesses about the type of structure being dealt with. A poor choice may result in little or no additional information being revealed by a particular experiment, resulting in the process being wasteful of both time and materials. There is also the danger that the prejudices of the operator will mask ambiguities in the interpretation of the results. Thus, the operator may assign a single structure which is consistent with the experimental results, whereas in reality there may be more than one structure consistent with the same set of observations. A further difficulty arises in defining the point at which no further information can be revealed by the use of the agents available.
The sequencing of oligosaccharides may be assisted by a knowledge of the biosynthetic pathways involved in building up the oligosaccharide structures.
Thus, for example, for N-linked oligosaccharides (N-glycans) it is known that there is a characteristic core structure and that monosaccharides may only add on in certain well defined orders and branching patterns. This knowledge may be used to develop the concept of `basic` or `core` structures for oligosaccharides from which sets of sub-structures can be generated by performing specific transformations on the `basic` structures.
BRIEF DESCRIPTION OF THE DRAWINGS
In the accompanying drawings three `basic` structures are shown respectively in FIGS. 1 to 3. The illustrated `basic` structures are not all the structural possibilities for N-glycans and are shown by way of example only.
FIGS. 5-8 show four groups of structural possibilities with a given set of monosaccharides.
In the case of these structures, the transformation used to generate the sub-structures is successive deletion of terminal monosaccharides. Numerous sub-structures can thus be formed starting with a `basic` structure and, for example, successively deleting monosaccharides in all possible ways. There are several different patterns of deletion of monosaccharides possible, leading to a very large number of sub-structures which can be derived from the `basic` structures. The number of sub-structures which can be generated by transformation from the three illustrated `basic` structures is, therefore, very large. In carrying out an analysis of an unknown oligosaccharide, an initial analysis will allow a first judgement to be made that the oligosaccharide belongs to a particular set of sub-structures. This leads to the term "candidate structure" which is used herein to refer to a structure which is included in the set of sub-structures as being a possible candidate for having the same structure as the unknown oligosaccharide.
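The successive deletion of terminal monosaccharides can be sketched as a simple enumeration. The Python below is a minimal illustration, assuming a structure is held as a nested tuple of the form (label, children); the representation, the function names and the tiny example structure are hypothetical and are not taken from FIGS. 1 to 3.

```python
from typing import Iterator, Set, Tuple

# A structure is (label, children), where children is a tuple of structures.
Tree = Tuple[str, tuple]

def delete_one_terminal(tree: Tree) -> Iterator[Tree]:
    """Yield each structure obtained by deleting exactly one terminal unit."""
    label, children = tree
    for i, child in enumerate(children):
        before, after = children[:i], children[i + 1:]
        if not child[1]:
            # The child is terminal: delete it outright.
            yield (label, before + after)
        else:
            # Otherwise delete a terminal unit somewhere beneath the child.
            for smaller in delete_one_terminal(child):
                yield (label, before + (smaller,) + after)

def all_substructures(basic: Tree) -> Set[Tree]:
    """All structures reachable from `basic` by successive terminal deletions."""
    seen = {basic}
    frontier = [basic]
    while frontier:
        current = frontier.pop()
        for smaller in delete_one_terminal(current):
            if smaller not in seen:
                seen.add(smaller)
                frontier.append(smaller)
    return seen

# Hypothetical four-unit structure: a root unit carrying a mannose with two arms.
basic = ("GlcNAc", (("Man", (("Man(a1-3)", ()), ("Man(a1-6)", ()))),))
substructures = all_substructures(basic)
```

Even this four-unit example gives five structures (the original included); the count grows quickly with the size and branching of the `basic` structure, which is why the set is reduced before sequencing begins.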
For example, if the oligosaccharide to be sequenced has been purified from a mixture of oligosaccharides released from a glycoprotein by the use of enzyme peptide-N-glycosidase F, then it can be assumed that the oligosaccharide is an N-glycan and that its structure is likely to be one of the sub-structures generated from `basic` structures similar to the ones illustrated in FIGS. 1 to 3.
However, because such a set of sub-structures can be very large, it is desirable to reduce the size of the initial set of candidate structures as much as possible. One way of approaching this problem is to consider one property of the sample to be sequenced which depends in some way upon its structure. Examples of such properties are:
i) the molar proportion of each monosaccharide (composition analysis)
ii) the retention time of the sample on a chromatographic column, which may be expressed in glucose units
These properties can be calculated for each sub-structure, and only those structures whose properties are the same as those of the sample to be sequenced need be retained for further consideration as candidates for the unknown structure.
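As a sketch of this filtering step for property (i), the following compares the monosaccharide composition of each sub-structure with that measured for the sample. The candidate names `P`, `Q`, `R`, their residue lists and the measured composition are hypothetical values chosen for the example.

```python
from collections import Counter

def filter_by_composition(substructures, measured):
    """Keep only the sub-structures whose monosaccharide counts equal
    the measured composition of the sample."""
    target = Counter(measured)
    return {
        name: residues
        for name, residues in substructures.items()
        if Counter(residues) == target
    }

# Hypothetical sub-structures, each given simply as a list of residues.
all_substructures = {
    "P": ["Man", "Man", "Man", "Gal", "GlcNAc", "GlcNAc"],
    "Q": ["Man", "Man", "Gal", "Gal", "GlcNAc", "GlcNAc"],
    "R": ["Man", "Man", "Man", "GlcNAc", "GlcNAc", "Gal"],
}
measured_composition = {"Man": 3, "Gal": 1, "GlcNAc": 2}

candidates = filter_by_composition(all_substructures, measured_composition)
print(sorted(candidates))  # ['P', 'R']
```

Structures P and R share a composition and therefore both survive, which is why further experiments with sequencing reagents are needed to distinguish them.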
The presence and linkage (both position and orientation) of a particular monosaccharide at the terminus of an oligosaccharide can be confirmed if it is known that a given enzyme will cause cleavage of that monosaccharide (and nothing else). If cleavage occurs, then detection of that monosaccharide in the products of the reaction confirms the presence of the linkage in the original structure.
For each enzyme used as a sequencing reagent, it can be established which monosaccharides will be cleaved from a particular `basic` oligosaccharide structure. The rules for such cleavage for a number of enzymes are set out in the Enzymes Rules Table above. By applying rules such as these to each candidate structure it is possible to determine for each candidate structure the monosaccharides which will be cleaved by bringing each enzyme in turn into contact with it.
The present invention provides a way of optimising the use of sequencing agents, interpreting the results unambiguously and determining the point where no further sequencing with the available reagents is possible. According to the present invention there is provided apparatus for sequencing an unknown oligosaccharide, comprising first means for deducing a set of oligosaccharide structures of which the unknown oligosaccharide is assumed to be a member, second means for simulating the effect of applying each one of a series of sequencing agents to each of said set of oligosaccharide structures and third means for determining the sequencing agent likely to give the most structural information if applied to the unknown oligosaccharide.
The invention also provides a method of sequencing an unknown oligosaccharide comprising deducing a set of oligosaccharide structures of which the unknown oligosaccharide is assumed to be a member, simulating the effect of applying each one of a series of sequencing agents to each of said set of oligosaccharide structures and determining the sequencing agent likely to give the most structural information if applied to the unknown oligosaccharide.
Preferably, the invention includes performing an initial analysis whereby to identify at least one basic structure (as herein defined) from which the said set of oligosaccharide structures can be generated.
Preferably also, the invention includes contacting the unknown oligosaccharide with the agent or agents selected in accordance with a set of rules for the use of sequencing agents and comparing the actual experimental result with the calculation.
The method of the present invention postulates an initial set of candidate structures and allows a calculation to be made to determine the most efficient experiment to be performed at any given time. Structures inconsistent with the results of this experiment are then eliminated from the existing set and the next experiment determined based on the structures remaining. The process is repeated until no further elimination is possible with the given sequencing agents.
By performing an exhaustive search among the known products of reaction of all candidate structures with the available sequencing reagents, it is possible to determine the optimum sequencing reagent or series or mixtures of sequencing reagents, such that the largest possible number of candidate structures can be eliminated by performing the reaction with this reagent and determining the products.
In order to determine the most efficient experiment to be carried out, consideration must then be given to the known results of applying each of the available reagents to the remaining candidate structures.
As an example, assume that the initial composition analysis indicates that the unknown oligosaccharide is a member of a set of candidate structures denoted [A, B, C, D] and that enzymes 1, 2, 3, 4 are the available sequencing reagents. The results of applying these enzymes to these structures can be predicted by applying enzyme rules such as those set out in the Enzyme Rules Table above, to each structure in turn. The rules, for example, might indicate that enzyme 1 will cleave the monosaccharide mannose from each of the candidate structures as follows,
Candidate A, enzyme 1 cleaves 1 mannose
Candidate B, enzyme 1 cleaves 1 mannose
Candidate C, enzyme 1 cleaves 0 mannose
Candidate D, enzyme 1 cleaves 2 mannose
Assume, for example, that the full results of applying each enzyme to each candidate structure are as follows,
______________________________________________________________
           Candidates
           A            B            C            D
______________________________________________________________
Enzyme 1   1 mannose    1 mannose    0 mannose    2 mannose
Enzyme 2   1 fucose     0 fucose     0 fucose     0 fucose
Enzyme 3   2 galactose  2 galactose  2 galactose  2 galactose
Enzyme 4   1 mannose    0 mannose    0 mannose    2 mannose
______________________________________________________________
The question to be answered in this example is which, if any, of the candidate set [A, B, C, D] is the actual structure of the unknown sample. It must be determined which of the available sequencing reagents (in this example enzymes 1 to 4) is the best one to use to help find the answer to that question most efficiently. This is done by considering a Breakdown Result (a) for each enzyme derived from data in the above table. A Breakdown Result groups the candidate structures into subsets where all the members of that subset give the same result in terms of the number of monosaccharides cleaved when reacted with the enzyme.
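The grouping that defines a Breakdown Result can be sketched as below, using the predictions given above for enzyme 1. The function name `breakdown_result` and the dictionary layout are assumptions made for illustration.

```python
from collections import defaultdict

def breakdown_result(predicted):
    """Group candidate structures by the predicted number of
    monosaccharides cleaved by one enzyme."""
    subsets = defaultdict(list)
    for candidate, cleaved in predicted.items():
        subsets[cleaved].append(candidate)
    return dict(subsets)

# Predicted mannose residues cleaved by enzyme 1 from each candidate.
enzyme_1 = {"A": 1, "B": 1, "C": 0, "D": 2}
result = breakdown_result(enzyme_1)
print(result)  # {1: ['A', 'B'], 0: ['C'], 2: ['D']}
```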
Also, for each Breakdown Result (a), a Breakdown Entropy (b) can be calculated to give a numerical measure of the efficiency of each enzyme. The Breakdown Entropy (b) is the sum over all the subsets in a Breakdown Result (a) of the term,
-N(i) * log N(i)
where N(i) is the size of the ith subset, i.e. the number of candidate structures in the ith subset. The numerical values in the example below correspond to logarithms taken to base 10.
The above table of data can therefore be expressed in terms of Breakdown Results (a) and Breakdown Entropy (b),
______________________________________________________________
(a)  enzyme 1, subset ([A, B],1), subset ([C],0), subset ([D],2)
(b)  entropy 1 = -(2 * log 2 + 1 * log 1 + 1 * log 1) = -0.602
(a)  enzyme 2, subset ([A],1), subset ([B, C, D],0)
(b)  entropy 2 = -(1 * log 1 + 3 * log 3) = -1.431
(a)  enzyme 3, subset ([A, B, C, D],2)
(b)  entropy 3 = -(4 * log 4) = -2.408
(a)  enzyme 4, subset ([A, D],1), subset ([B, C],2)
(b)  entropy 4 = -(2 * log 2 + 2 * log 2) = -1.204
______________________________________________________________
The Breakdown Entropy of an enzyme is a direct measure of the scatter of the structures in the candidate set among different possible experimental outcomes from applying that enzyme. When all experimental outcomes are the same, the Breakdown Entropy has its minimum value (as in the above example for enzyme 3), and it is clear that it is impossible to distinguish between the candidate structures by performing that experiment. Conversely, if all experimental outcomes are distinct, the Breakdown Entropy is identically zero, which is its maximum possible value, and performing that experiment will identify uniquely the structure being sought. In general, the enzyme with the maximum Breakdown Entropy is chosen so as to give the highest statistical chance of rapidly eliminating structures from the set of candidates. In the example above therefore, enzyme 1 would be chosen.
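The entropy values and the choice of enzyme 1 can be reproduced with a short calculation from the subset sizes listed in the Breakdown Results. The sketch assumes base-10 logarithms, which is the base that reproduces the quoted figures; the function name `breakdown_entropy` is illustrative.

```python
import math

def breakdown_entropy(subset_sizes):
    """Sum of -N(i) * log10 N(i) over the subsets of a Breakdown Result."""
    return -sum(n * math.log10(n) for n in subset_sizes)

# Subset sizes taken from the Breakdown Results (a) for each enzyme.
subset_sizes = {
    "enzyme 1": [2, 1, 1],
    "enzyme 2": [1, 3],
    "enzyme 3": [4],
    "enzyme 4": [2, 2],
}
entropies = {e: breakdown_entropy(s) for e, s in subset_sizes.items()}
best = max(entropies, key=entropies.get)

for enzyme, value in entropies.items():
    print(f"{enzyme}: {value:.3f}")
print("chosen:", best)  # chosen: enzyme 1
```

A Breakdown Result in which every candidate falls in its own subset has all N(i) equal to 1, so every term vanishes and the entropy is zero, the maximum described above.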
The Breakdown Entropy is the primary index of evaluation in the search for the best sequencing experiment. An enzyme can be eliminated from further consideration in the search under the following criteria,
A) if the enzyme would not cleave any monosaccharides from any of the candidate structures,
B) if the Breakdown Entropy is less than that of some other enzyme,
C) if the enzyme cleaves more monosaccharides than some other enzyme which has an equal Breakdown Entropy,
D) if the enzyme is more expensive to use than another enzyme which offers an equal result.
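Criteria A to D amount to a filter followed by an ordered comparison, which might be sketched as follows. The record fields `entropy`, `cleaved` and `cost`, the enzyme names 5 and 6 and all the numbers are hypothetical; in particular, `cleaved` is assumed here to be a single count of monosaccharides removed, which is only one way of reading criterion C.

```python
def select_enzyme(options):
    """Pick one enzyme record according to criteria A to D.

    Each option is a dict with hypothetical keys:
    name, entropy (Breakdown Entropy), cleaved, cost.
    """
    # Criterion A: discard enzymes that cleave nothing from any candidate.
    active = [o for o in options if o["cleaved"] > 0]
    if not active:
        return None
    # Criterion B: prefer the highest Breakdown Entropy.
    # Criterion C: among equal entropies, prefer fewer monosaccharides cleaved.
    # Criterion D: among otherwise equal enzymes, prefer the cheaper one.
    return min(
        active,
        key=lambda o: (-round(o["entropy"], 9), o["cleaved"], o["cost"]),
    )

options = [
    {"name": "enzyme 1", "entropy": -0.602, "cleaved": 4, "cost": 2.0},
    {"name": "enzyme 5", "entropy": -0.602, "cleaved": 3, "cost": 5.0},
    {"name": "enzyme 2", "entropy": -1.431, "cleaved": 1, "cost": 1.0},
    {"name": "enzyme 6", "entropy": -2.408, "cleaved": 0, "cost": 1.0},
]
chosen = select_enzyme(options)
print(chosen["name"])  # enzyme 5
```

Here enzymes 1 and 5 tie on entropy, so criterion C decides in favour of enzyme 5 even though it is the more expensive of the two; cost is consulted only when the earlier criteria do not separate the enzymes.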
If a first generation search (i.e. a search on the original sample) using all available enzymes, yields no single enzyme with a Breakdown Entropy greater than the minimum, then a second generation search (i.e. a search on the residue from the first generation search) using two enzymes, either together or in sequence is performed, and failing that, third generation, fourth generation, etc., searches are performed, until the point is reached where the candidate structures are deemed indistinguishable because all enzyme mixtures being tested would reduce all candidate structures to a common residue. At this point it is possible to conclude that there is no experiment with the available sequencing reagents which could further reduce the size of the set of candidate structures.
The method can be carried out as a series of logical steps, not all of which need to be performed in every case:
1. Initialisation
1.1 Specify a set of `basic` structures
1.2 Specify sample composition
1.3 Generate all the sub-structures from the `basic` structures in 1.1 having the composition given in 1.2. These form the initial set of candidate structures
2. Calculation
2.1 Search for the best reagent
2.1.1 For each reagent calculate a Breakdown Result listing the number of monosaccharides cleaved from each candidate structure by that reagent.
2.1.2 Compare the results using criteria A to D above and select the best Breakdown Result,
2.1.2.1 If no Breakdown Result is found which contains two or more subsets of residues which give different results when the reagent is applied, then select the next best Breakdown Result.
2.1.2.2 If no single reagent can be found that can distinguish between the candidate structures, then the search is repeated using all possible series of two, three or more reagents applied one after the other. At each stage, reagents which have no effect on any of the candidate structures are eliminated from the search. If a series of one or more sequencing reagents is found which produces a Breakdown Entropy greater than the minimum, then this series of experiments is chosen to be performed sequentially on the sample.
3. Experiment
3.1 Carry out the experiment(s) indicated by 2. above
4. Result
Check the results against predictions,
4.1 Compare the number of monosaccharides cleaved using the recommended reagent with the prediction for each subset in the Breakdown Result for that enzyme.
4.2 If the sample is identified as a member of one of the subsets of structures in the Breakdown Result in that it matches in the number of monosaccharides cleaved with the experimental result, it may be that further information can be obtained. A further set of calculations can be performed by re-defining the set of candidate structures as the subset of structures identified in the Breakdown Result.
4.3 If no subset of the Breakdown Result matches the experimental result then it is likely that either,
i) the experiment has failed, or
ii) the initial assumptions about the `basic` structures were invalid.
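Steps 2 to 4 above can be sketched as a loop. This is a simplified illustration rather than the expert system itself: it considers single reagents only, assumes every experiment is run on a fresh portion of the original sample, and stands in for the laboratory experiment with a hypothetical `observe` callback. The predictions are those of enzymes 1 to 3 in the example table.

```python
import math
from collections import defaultdict

def breakdown(predicted, candidates):
    """Group the current candidates by predicted number cleaved."""
    subsets = defaultdict(list)
    for c in candidates:
        subsets[predicted[c]].append(c)
    return subsets

def entropy(subsets):
    """Breakdown Entropy with base-10 logarithms (an assumption)."""
    return -sum(len(s) * math.log10(len(s)) for s in subsets.values())

def sequence(predictions, observe):
    """Repeat calculation, experiment and elimination until one candidate
    remains or no single reagent can separate those remaining."""
    candidates = sorted(next(iter(predictions.values())))
    while len(candidates) > 1:
        # Step 2: keep only reagents giving two or more subsets.
        results = {e: breakdown(p, candidates) for e, p in predictions.items()}
        useful = {e: b for e, b in results.items() if len(b) > 1}
        if not useful:
            break  # the remaining candidates are indistinguishable
        best = max(useful, key=lambda e: entropy(useful[e]))
        # Step 3: carry out the recommended experiment.
        measured = observe(best)
        # Step 4.3: no subset matches the experimental result.
        if measured not in useful[best]:
            raise ValueError("experiment failed or initial assumptions invalid")
        # Step 4.2: redefine the candidate set as the matching subset.
        candidates = useful[best][measured]
    return candidates

predictions = {
    "enzyme 1": {"A": 1, "B": 1, "C": 0, "D": 2},
    "enzyme 2": {"A": 1, "B": 0, "C": 0, "D": 0},
    "enzyme 3": {"A": 2, "B": 2, "C": 2, "D": 2},
}
# Pretend the unknown sample is structure B.
identified = sequence(predictions, lambda enzyme: predictions[enzyme]["B"])
print(identified)  # ['B']
```

With B as the unknown, enzyme 1 is recommended first and leaves [A, B]; enzyme 2 then separates the pair. Each accepted experiment strictly shrinks the candidate set, so the loop always terminates.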
Overall
It is clear that the amount of calculation to be performed can be considerable. The method is, therefore, best performed using a computer, and a program can be written to carry out the calculations and allow the user to make the choices needed to select the experiments. However, the kinds of decisions to be made are essentially those which an expert can perform, and a computer expert system has been developed to perform the logical operations involved.
To test the accuracy of this computer expert system a known, pure oligosaccharide was analysed using, for reference, the three `basic` structures shown in FIGS. 1 to 3. The total number of structural variants generated from the three `basic` structures by cleaving monosaccharides in all known ways from these three oligosaccharides, is about 70,000.
The sample used for analysis was NA2F, which has the composition shown in FIG. 4.
Monosaccharide composition analysis of the sample gave,
Mannose=3
Galactose=2
GlcNAc=4
Fucose=1
Four groups of structural possibilities can be derived from a composition having these monosaccharides in these proportions and these four groups are shown in outline respectively in FIGS. 5, 6, 7 and 8. In addition, the structures shown in these Figures give multiple variants in the following numbers,
FIG. 5: 9 variants
FIG. 6: 18 variants
FIG. 7: 12 variants
FIG. 8: 6 variants
Within each group, all variants of bond numbering are included and both core and outer arm fucosylation are accounted for. The above variants give a set of 45 candidate structures and this was confirmed by the computer expert system, which successively recommended the use of the following enzymes with the results shown,
__________________________________________________________________________________
                                         Monosaccharide   Candidate structures
Enzyme to use                            detected         remaining
__________________________________________________________________________________
Jack bean alpha man (non arm-specific)   None             27
Bovine testis beta gal                   Gal = 2          9
Dip.P.beta hex                           GlcNAc = 2       1
__________________________________________________________________________________
The variants of FIGS. 7 and 8 are eliminated by the first experiment, because these would yield mannose if treated with the alpha-mannosidase, whereas the putative test sample yielded none. The variants of FIG. 6 are eliminated by the second experiment, which yielded two galactose, whereas the presence of the outer arm fucose branch in FIG. 6 blocks the action of the galactosidase on that branch. Finally, of the nine remaining structures of FIG. 5, only the structure given would yield two N-Acetyl Glucosamine entities when reacted with the enzymes in the sequence given.
In this sequence, the structure is uniquely identified.
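The candidate counts reported for the first two experiments follow directly from the variant numbers given for FIGS. 5 to 8, as the following arithmetic check shows. The variable names are illustrative only.

```python
# Number of variants in each group, as listed for FIGS. 5 to 8.
variants = {"FIG. 5": 9, "FIG. 6": 18, "FIG. 7": 12, "FIG. 8": 6}

initial = sum(variants.values())
# The alpha-mannosidase experiment eliminates the variants of FIGS. 7 and 8.
after_mannosidase = initial - variants["FIG. 7"] - variants["FIG. 8"]
# The beta-galactosidase experiment eliminates the variants of FIG. 6.
after_galactosidase = after_mannosidase - variants["FIG. 6"]

print(initial, after_mannosidase, after_galactosidase)  # 45 27 9
```

The final step, from nine candidates to one, depends on which FIG. 5 variant yields two N-acetylglucosamine residues and cannot be reproduced from the counts alone.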

Claims (13)

We claim:
1. Apparatus for sequencing an unknown oligosaccharide, comprising first means for deducing a set of oligosaccharide structures of which the unknown oligosaccharide is assumed to be a member, second means for simulating the effect of applying each one of a series of sequencing agents to each of said set of oligosaccharide structures and third means for determining the sequencing agent likely to give the most structural information if applied to the unknown oligosaccharide.
2. Apparatus as claimed in claim 1, in which the first means comprises means for determining a property of the unknown oligosaccharide which depends on its structure.
3. Apparatus as claimed in claim 2, in which the property is the number and type of each monosaccharide in the unknown oligosaccharide.
4. Apparatus as claimed in claim 2, in which the property is the retention time of said unknown oligosaccharide on a liquid chromatography column.
5. Apparatus as claimed in claim 1, in which the second means comprises means for consulting a set of rules which determine the effect of applying each of said sequencing agents to an oligosaccharide structure.
6. Apparatus as claimed in claim 5, in which the third means includes means for performing an exhaustive search among the predicted products of reaction of all said set of oligosaccharide structures.
7. Apparatus as claimed in claim 6, in which the third means comprises means for calculating the distribution of the predicted products of reaction for all the structures in the set of oligosaccharide structures.
8. A method of sequencing an unknown oligosaccharide comprising deducing a set of oligosaccharide structures of which the unknown oligosaccharide is assumed to be a member, simulating the effect of applying each one of a series of sequencing agents to each of said set of oligosaccharide structures and determining the sequencing agent likely to give the most structural information if applied to the unknown oligosaccharide.
9. A method as claimed in claim 8, wherein deducing the set of oligosaccharide structures comprises identifying at least one basic structure from which a set of sub-structures can be generated by deleting monosaccharides from the at least one basic structure in all possible ways, and further identifying a set of candidate structures from among the set of sub-structures of which the unknown oligosaccharide is assumed to be a member.
10. A method as claimed in claim 8, including calculating the distribution of the predicted products of reaction for all the oligosaccharide structures in the set of oligosaccharide structures by consulting a set of rules which determine the effect of applying each of said sequencing agents to an oligosaccharide structure.
11. A method as claimed in claim 10, including conducting an experiment by contacting the unknown oligosaccharide with the agent or agents selected in accordance with the set of rules and comparing the actual experimental result with the calculation.
12. A method as claimed in claim 11, including eliminating structures from the candidate set which are inconsistent with the results obtained from an experiment and determining the next experiment based on the structures remaining.
13. A method as claimed in claim 12, in which the process is repeated until no further elimination is possible with the given sequencing agents.
US08/140,143 1991-05-07 1992-05-07 Sequencing of oligosaccharides Expired - Fee Related US5427744A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
GB9109842 1991-05-07
GB919109842A GB9109842D0 (en) 1991-05-07 1991-05-07 Sequencing of oligosaccharides
PCT/GB1992/000831 WO1992019766A1 (en) 1991-05-07 1992-05-07 Sequencing of oligosaccharides

Publications (1)

Publication Number Publication Date
US5427744A true US5427744A (en) 1995-06-27

Family

ID=10694565

Family Applications (1)

Application Number Title Priority Date Filing Date
US08/140,143 Expired - Fee Related US5427744A (en) 1991-05-07 1992-05-07 Sequencing of oligosaccharides

Country Status (6)

Country Link
US (1) US5427744A (en)
EP (1) EP0584128B1 (en)
JP (1) JPH06506828A (en)
DE (1) DE69221861T2 (en)
GB (1) GB9109842D0 (en)
WO (1) WO1992019766A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6582965B1 (en) 1997-05-22 2003-06-24 Oxford Glycosciences (Uk) Ltd Method for de novo peptide sequence determination
US6963807B2 (en) 2000-09-08 2005-11-08 Oxford Glycosciences (Uk) Ltd. Automated identification of peptides

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9814420D0 (en) * 1998-07-03 1998-09-02 Cancer Res Campaign Tech Sequence analysis of saccharide material
JP6360738B2 (en) * 2014-07-10 2018-07-18 学校法人 愛知医科大学 Method for determining the sequence structure of glycosaminoglycan sugar chains

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0421972A2 (en) * 1989-10-03 1991-04-10 Oxford Glycosystems Ltd. Oligosaccharide sequencing
US5100778A (en) * 1989-10-03 1992-03-31 Monsanto Company Oligosaccharide sequencing
WO1992002816A1 (en) * 1990-07-27 1992-02-20 University Of Iowa Research Foundation Electrophoresis-based sequencing of oligosaccharides

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Joseph K. Welply: "Sequencing methods for carbohydrates and their biological applications"; TIBTECH; 7 (1989) Jan.; pp. 5-10.

Also Published As

Publication number Publication date
EP0584128A1 (en) 1994-03-02
EP0584128B1 (en) 1997-08-27
JPH06506828A (en) 1994-08-04
DE69221861T2 (en) 1998-01-02
GB9109842D0 (en) 1991-06-26
DE69221861D1 (en) 1997-10-02
WO1992019766A1 (en) 1992-11-12

Legal Events

Date Code Title Description
AS Assignment

Owner name: OXFORD GLYCOSYSTEMS LTD, UNITED KINGDOM

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PAREKH, RAJESH BHIKHU;PRIME, SALLY BARBARA;REEL/FRAME:006914/0026;SIGNING DATES FROM 19931109 TO 19931112

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FEPP Fee payment procedure

Free format text: PAT HLDR NO LONGER CLAIMS SMALL ENT STAT AS SMALL BUSINESS (ORIGINAL EVENT CODE: LSM2); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20070627