|
Similarity Manager
The Similarity Manager is a Windows desktop application that provides a
simple and quick way to handle the structural similarity and diversity of
moderately large virtual compound sets.
Diversity Assessment
Molecular diversity is one of the most important characteristics of screening
libraries. To increase the hit rate, it is advisable to use the most
representative compound set that covers the chemical space relevant to the
target of interest. To help select the most favorable screening library, a
simple measure has been introduced that explicitly informs the medicinal
chemist about the diversity of the investigated compound set.
A novel diversity assessment method, the Explicit Diversity Index (EDI), is
introduced for drug-like molecules. EDI combines structural and
synthesis-related dissimilarity values and expresses them as a single number.
As an easily interpretable measure, it facilitates decision making in the
design of combinatorial libraries and may assist in comparing compound sets
provided by different manufacturers. Because it can be calculated rapidly,
EDI makes it practical to assess the diversity of in-house or commercial
compound collections.
Besides the EDI, the Similarity Manager calculates the most common similarity
indices that quantify the structural diversity of a compound set: the TAT
(Total Average Tanimoto) and NAT (Nearest Average Tanimoto) indices.
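
The text does not give formulas for the TAT and NAT indices, so the following
is only a minimal sketch under stated assumptions: TAT is taken to be the mean
Tanimoto coefficient over all distinct compound pairs, and NAT the mean of each
compound's highest Tanimoto coefficient to any other compound in the set.
Fingerprints are represented as Python sets of on-bit indices, and all function
names are hypothetical; none of this is the Similarity Manager's actual code.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def total_average_tanimoto(fps):
    """Mean Tanimoto over all distinct pairs (assumed reading of TAT)."""
    pairs = list(combinations(fps, 2))
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


def nearest_average_tanimoto(fps):
    """Mean nearest-neighbor Tanimoto per compound (assumed reading of NAT)."""
    nearest = [
        max(tanimoto(fp, other) for j, other in enumerate(fps) if j != i)
        for i, fp in enumerate(fps)
    ]
    return sum(nearest) / len(nearest)


# Toy fingerprints: two close analogs and one unrelated structure.
fingerprints = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
tat = total_average_tanimoto(fingerprints)
nat = nearest_average_tanimoto(fingerprints)
print(f"TAT = {tat:.2f}, NAT = {nat:.2f}")
```

Under these assumed definitions, lower values of either index indicate a more
diverse set. Both functions need at least two fingerprints.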
Diverse selection
Selecting representative, diverse subsets from large compound collections
plays a key role in the initial design phase of successful screening
procedures. Nowadays, drug candidate library providers offer several hundred
thousand compound structures in a single collection, which makes rapid yet
accurate selection indispensable.
The widespread use of combinatorial chemistry has made the selection of
diverse subsets from virtual libraries common practice. For relatively small
compound sets, the very simple maximum dissimilarity algorithm can provide an
approximately optimal selection. However, the computational cost of the
algorithm is proportional to the size of the candidate pool and to the square
of the size of the selected subset, so the method is not practical for large
compound sets. The OptiSim algorithm is one of the best-known modifications of
this method, and several commercially available software packages apply it.
OptiSim considers a randomly selected subsample of candidates instead of the
whole pool. This can accelerate the selection process remarkably, but the
diversity of the selected subset is generally lower than that obtained with
the maximum dissimilarity algorithm.
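
The two approaches described above can be sketched as follows. This is an
illustrative simplification, not the implementation used by any particular
package: fingerprints are sets of on-bit indices, dissimilarity is taken as
1 minus the Tanimoto coefficient, and the function names and the
`subsample_size` parameter are hypothetical. The subsampling variant captures
only the idea stated in the text (evaluating a random subsample rather than the
whole pool) and omits other details of the published OptiSim method.

```python
import random


def tanimoto_distance(a, b):
    """1 - Tanimoto for fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union


def min_distance_to_selected(pool, candidate, selected):
    """Distance from one candidate to its nearest already-selected compound."""
    return min(tanimoto_distance(pool[candidate], pool[s]) for s in selected)


def max_dissimilarity(pool, n_select, seed_index=0):
    """Greedy maximum dissimilarity selection over the whole candidate pool.

    Every step rescans all remaining candidates against all selected compounds,
    which gives the cost described in the text: linear in the pool size and
    quadratic in the subset size.
    """
    selected = [seed_index]
    remaining = set(range(len(pool))) - {seed_index}
    while remaining and len(selected) < n_select:
        best = max(
            sorted(remaining),
            key=lambda i: min_distance_to_selected(pool, i, selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


def subsampled_selection(pool, n_select, subsample_size, seed_index=0, rng=None):
    """Same greedy rule, but each step scores only a random subsample."""
    rng = rng or random.Random(0)
    selected = [seed_index]
    remaining = set(range(len(pool))) - {seed_index}
    while remaining and len(selected) < n_select:
        k = min(subsample_size, len(remaining))
        subsample = rng.sample(sorted(remaining), k)
        best = max(
            subsample,
            key=lambda i: min_distance_to_selected(pool, i, selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy pool: two pairs of close analogs plus one unrelated compound.
pool = [{1, 2, 3}, {1, 2, 4}, {5, 6, 7}, {5, 6, 8}, {9, 10}]
full = max_dissimilarity(pool, 3)
fast = subsampled_selection(pool, 3, subsample_size=2)
print(full, fast)
```

With a small `subsample_size`, each step is cheaper, but a step may miss the
most dissimilar candidate, which is the trade-off between speed and diversity
noted above.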
Similarity Manager 1.0 includes a novel diverse selection algorithm developed
to accelerate the time-consuming selection procedure without sacrificing the
diversity of the selected subset.
Analog Search
The properties of hashed fingerprints dictate that the fingerprints of similar
structures will also be similar. The converse is not true in all cases, but in
practice structures with similar fingerprints are highly likely to be similar.
Thus, fingerprints make it easy to perform so-called analog searches:
structures similar to specified compounds can be found quickly on the basis of
fingerprint similarity. The Analog Search module of Similarity Manager 1.0
identifies the compounds that meet the specified similarity criteria.
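
A fingerprint-based analog search of this kind can be sketched in a few lines.
The example below is an assumption-laden illustration rather than the Analog
Search module itself: it uses the Tanimoto coefficient as the similarity
measure, a single threshold as the "similarity criterion", and sets of on-bit
indices as fingerprints. The function name, the compound names, and the
default threshold are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return 1.0 if union == 0 else len(a & b) / union


def analog_search(query, library, threshold=0.7):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy library keyed by hypothetical compound names.
library = {
    "cmpd_A": {1, 2, 3, 4},
    "cmpd_B": {1, 2, 3, 5},
    "cmpd_C": {7, 8, 9},
}
hits = analog_search({1, 2, 3, 4}, library, threshold=0.5)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

Because only fingerprints are compared, a hit is a candidate analog rather than
a guaranteed one, in line with the caveat above that similar fingerprints do
not always imply similar structures.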
|