In order to determine a macromolecular crystal structure, a number of processes have to be performed in a linear fashion, with each stage dictating the outcome of the next. Initially, good-quality protein crystals have to be grown that are amenable to X-ray diffraction analysis. The routine use of cryopreservation techniques over the last 10 years has reduced the size of usable protein crystal required, removing much of the burden of teasing out ordered growth conditions for making relatively large crystals. In addition, the physical damage to protein crystals from high-energy X-ray irradiation is greatly reduced by keeping crystals frozen under conditions where ice crystal formation is prevented. This allows data sets to be collected from crystals as small as 10 μm in size, whereas previously crystals had to be larger to compensate for major damage during the X-ray irradiation procedure.
Advancements in the optical devices used to focus high-energy X-ray beams on the crystal have matched the advances in crystal cryopreservation, meaning that good-quality data sets can also be collected. Traditionally, particle accelerator facilities known as synchrotrons have been used as a source of very high-energy X-rays for protein crystallography. While this is still the case where very rapid collection of good-quality data sets is required, the usable X-ray flux of home sources has also increased significantly through a combination of improvements to rotating anode generators, X-ray optical systems and detector systems. This improvement to in-house systems means that a good laboratory single-crystal X-ray diffraction machine can now be used for relatively rapid data collection: data that may have taken several days to collect on older systems can now be collected in as little as a couple of hours. The increased productivity of in-house systems means that the manual step of putting the crystal in the beam can become a bottleneck during 24/7 operation. For this reason, automated systems have been developed to replace this manual operation, taking crystal-containing cryoloops from liquid nitrogen storage and placing them in the X-ray beam prior to data collection. Some synchrotron facilities now also have automated crystal-mounting systems that can be used remotely. In this process, a user cryopreserves many crystals on data collection loops, places them in a specialist carrier cassette, and sends the cassette to the synchrotron by courier in a cryo-storage container. The data are collected and the data sets returned by courier or via the Internet, without the scientist needing to set foot outside their own laboratory.
The number and quality of the software tools available, coupled with cheap computational processing power, mean that the data analysis part of the structure-determination process is greatly enhanced, with ever-decreasing timelines. The steps required for structure solution are the processing of collected data sets, structure solution, molecular refinement and building of the structural model. Refinement and model building are continued in cycles until the model is of acceptable quality to the structural biologist. When attempting to derive a novel structure, the parts of the process that can be bottlenecks are the structure solution and the model building, especially in the case of relatively large proteins of >50 kDa. Structure solution is becoming far more straightforward with the increasing number of proteins deposited in the public Protein Data Bank (at the time of writing there were 33,000 macromolecular structures). The greater the number of available structures, the greater the probability that there is a homologous structure that can be used as a computational search model to solve a novel protein structure by molecular replacement, a very straightforward and rapid computational process. Where no homologous structure is available, more direct experimental methods need to be performed to provide data for direct phase solution, which can add significant time to a project.
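The refine-and-rebuild cycle described above can be sketched as a simple loop. This is only an illustrative toy model: the "R-factor" here is a stand-in with hypothetical numbers, not output from any real refinement package, and the per-cycle improvement factor is an assumption made purely for the sketch.

```python
# Toy sketch of the cyclical refinement/model-building process.
# The R-factor behaviour here is hypothetical, not real refinement output.

def refine_until_acceptable(r_start=0.45, r_target=0.25,
                            improvement_per_cycle=0.85, max_cycles=20):
    """Alternate refinement and model rebuilding until the model reaches
    acceptable quality (judged here by a mock R-factor)."""
    r = r_start
    cycles = 0
    while r > r_target and cycles < max_cycles:
        r *= improvement_per_cycle   # refinement improves fit to the data
        # ...manual or automated model rebuilding would happen here...
        cycles += 1
    return r, cycles

r_final, n = refine_until_acceptable()
print(f"converged to R = {r_final:.3f} after {n} cycles")
```

In practice each cycle is far from this uniform: early cycles of rebuilding into density do most of the work, and convergence is judged by the crystallographer, not by a single threshold.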
Once a structure of the desired protein has been solved, producing subsequent high-quality structures is very rapid; indeed, some groups have linked various scripts together, or modified software tools, to largely automate repeated crystal structure solution, such as when solving multiple ligand complexes of the same protein.
Despite the major advances in X-ray crystallography, the process of producing multiple protein-ligand complexes cannot handle anywhere near the same number of compounds as high throughput compound screening (HTS) methods. Ambitious X-ray structure campaigns can produce hundreds of protein-ligand complexes in several months, whereas HTS can handle at least a million compounds over the same period using in vitro screening technologies. The use of X-ray crystallography as a direct screening tool for conventional compound libraries is therefore limited, especially as corporate collections tend to range from 100,000 to several million unique compounds. One method developed to enable X-ray structure determination to be used as a primary screening tool is the use of compound cocktails in protein crystal soaking experiments. This method relies on a supply of well-diffracting protein crystals amenable to ligand soaking: the packing of the molecules in the crystalline matrix must be arranged so that ligand can easily diffuse into the active site of many of the target molecules making up the crystal, without causing structural movements of the protein that could reduce the crystal's diffraction quality. Each compound cocktail is selected so that its members are diverse in shape, to facilitate identification of the 'hit' (Figure 1.2).
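Selecting shape-diverse cocktail members is essentially a maximum-diversity picking problem. A minimal sketch of one common approach (greedy max-min selection) is shown below; the bit-set "fingerprints" and the five-compound library are hypothetical placeholders, not real shape descriptors from a cheminformatics toolkit, and the text does not specify which selection method the cited screens actually used.

```python
# Minimal sketch of cocktail selection by shape diversity using greedy
# max-min picking. Fingerprints are hypothetical feature bit sets.

def tanimoto_distance(a, b):
    """1 - Tanimoto similarity between two feature (bit) sets."""
    union = len(a | b)
    return 1.0 if union == 0 else 1.0 - len(a & b) / union

def pick_cocktail(fingerprints, size):
    """Greedily choose `size` compounds, each maximising its minimum
    distance to the compounds already in the cocktail."""
    ids = list(fingerprints)
    chosen = [ids[0]]                      # arbitrary seed compound
    while len(chosen) < size:
        best = max((i for i in ids if i not in chosen),
                   key=lambda i: min(tanimoto_distance(fingerprints[i],
                                                       fingerprints[c])
                                     for c in chosen))
        chosen.append(best)
    return chosen

library = {                                # toy 5-compound library
    "cpd1": {1, 2, 3}, "cpd2": {1, 2, 4},  # cpd2 is a near-duplicate of cpd1
    "cpd3": {7, 8, 9}, "cpd4": {3, 5, 7}, "cpd5": {10, 11},
}
print(pick_cocktail(library, 3))           # → ['cpd1', 'cpd3', 'cpd5']
```

The greedy picker avoids near-duplicates (cpd2 is never chosen while dissimilar compounds remain), which is exactly the property needed for unambiguous assignment of electron density to one cocktail member.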
Once the crystals have been soaked in compound, rapid X-ray data are collected, sufficient to allow the generation of an electron density map. The difference in density between empty, unsoaked crystals and those where a compound has bound provides ligand electron density that allows the identification of the bound compound from the cocktail. This is only possible because of the diverse nature of the compounds in each cocktail, which allows the density to be assigned unambiguously to a compound of a specific shape. The technique picks out the most potently binding ligand in the mixture; to identify less potent ligands in the same mixture, the identified compound would have to be removed from the cocktail and the screening repeated.

Fig. 1.2 Schematic diagram to show the process involved in crystallographic screening of compound cocktails: compound library cocktail → binding site electron density solved → compound identified and binding mode determined → further lead optimisation chemistry.
The number of compounds in each cocktail is limited by two factors. Firstly, there must be sufficient shape diversity to allow unambiguous identification of a compound by electron density at the resolution of the structure. Secondly, the concentration of each compound in the cocktail has to be high enough to give relatively high occupancy in the protein crystal, and the achievable concentration of each compound decreases as the number of compounds in the cocktail increases. Despite these limitations, the technique has been used to screen cocktails containing as many as 100 compounds, which potentially allows low thousands of compounds to be screened per day on some in-house systems, and potentially more at synchrotron facilities. This crystallographic screening method is a tool that complements existing approaches and certainly has the potential to bring structural data into the early phase of drug discovery.
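The second limitation above is simple arithmetic: with a fixed cap on the total soak concentration, the per-compound concentration falls in proportion to cocktail size. The sketch below makes that trade-off explicit; the 100 mM total-concentration cap and the crystals-per-day throughput are illustrative assumptions, not figures from the text's references.

```python
import math

# Back-of-the-envelope arithmetic for the cocktail-size trade-off.
# total_soak_mM and crystals_per_day are illustrative assumptions
# (the cap might come from compound solubility or DMSO tolerance).

def cocktail_plan(library_size, cocktail_size,
                  total_soak_mM=100.0, crystals_per_day=20):
    """Per-compound concentration and screening time for a library,
    assuming one soaked crystal (one data set) per cocktail."""
    per_compound_mM = total_soak_mM / cocktail_size
    n_cocktails = math.ceil(library_size / cocktail_size)
    days = math.ceil(n_cocktails / crystals_per_day)
    return per_compound_mM, n_cocktails, days

# Screening 2,000 compounds in cocktails of 100 (as in the text):
conc, cocktails, days = cocktail_plan(2000, 100)
print(conc, cocktails, days)   # → 1.0 20 1
```

Halving the cocktail size doubles the per-compound concentration (helping occupancy for weak binders) but also doubles the number of crystals and data sets required, which is the tension the text describes.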
Another crystallographic screening method is known by a number of names, including needle, shape or fragment screening [11, 12], where compounds, in many cases much smaller than conventional high throughput screening compounds, are used in structure solution campaigns to provide hits for structure-based design chemistry programmes. The philosophy behind using smaller molecules is that these relatively simple chemicals can probe more binding sites than conventional, larger high throughput screening compounds, because they lack extensive functionality. For example, an inappropriate side chain that causes steric or electronic clashes within an active site could render an otherwise exciting chemical scaffold inactive in a screen. A simple model of ligand-receptor interactions has proposed that the probability of a compound binding to a receptor decreases as the molecular weight of the compound increases. For example, the authors show from statistical modelling that, for a binding site with a complexity score of 12, the probability of randomly selecting a compound with a unique binding mode is 39.9% for a ligand of complexity 3, but only 1% for a ligand of complexity 11. Generally, compounds of lower molecular weight and complexity are less potent binders, with activities typically between 100 μM and 10 mM, as they are usually capable of making fewer interactions with the binding site. This means that hit selection in screening, and the subsequent chemistry programmes, have to take a different approach from that used for hits discovered by conventional high throughput screening, where compounds with activity better than 10 μM are sought.
One major problem with screening compounds at high assay concentrations is that they can aggregate in solution, and these aggregates can associate with and inhibit enzymes non-selectively. Such effects have been observed for a number of generic protein kinase inhibitors, which can form aggregates in solution and become promiscuous inhibitors, reducing the activity even of non-ATP-dependent enzymes. Fortunately, this effect can be monitored, and largely prevented, by running identical assays in the presence of detergent, which reduces compound aggregate formation [16, 17]. Light-absorbing compounds can also be a problem in fluorometric kinase assays, so high-concentration kinase assays should either be radiometric or include a compound wash-out step prior to signal detection.
A multi-stage screening approach was adopted for developing ATP-site inhibitors of DNA gyrase. In this approach, the authors first used an in silico screen against a model structure to select an initial compound screening set of 600 compounds. This set was increased to 3,000 by the addition of close analogues. Many of these compounds had a molecular weight below 300, and none above 400 were selected. The authors referred to these compounds as needles, because they were small enough to penetrate deep into active-site pockets and clefts. An in vitro assay was then used that had been selected to tolerate compound assay concentrations of 0.5 mM. This high-concentration in vitro screen identified 150 hits with significant inhibitory activity.
The objective of the overall screening approach was to identify compounds whose mode of action was through binding in the ATP-binding pocket. For this reason, the 24 kDa N-terminal fragment of DNA gyrase subunit B, which contains the ATP-binding site of the enzyme, was used for further studies. These studies initially consisted of analytical ultracentrifugation and surface plasmon resonance, with the more interesting compounds also being analysed by NMR experiments on 15N-labelled protein. NMR allowed the identification of residues within the active site whose N-H signals shifted upon ligand binding, and so more clearly defined how different series of compounds were binding in the ATP-binding site. For final confirmation of binding, and as an aid to structure-guided design, X-ray crystal structures were then generated with a loop-deleted version of the ATP-binding subunit fragment that was amenable to crystallography. This approach allowed the authors to use structure-based chemistry iterations to develop novel inhibitors 10-fold more potent than the clinical competitor drug, increasing the potency of the original weak indazole needle hit by a factor of 266 (Figure 1.3).