Similarity Manager

Brief Description


The Similarity Manager is a Windows desktop application that provides a simple and quick way to handle the structural similarity and diversity of moderately large virtual compound sets.


Features at a Glance


Molecular diversity is one of the most important characteristics of screening libraries. To increase the hit rate, it is advisable to use the most representative compound set that covers the chemical space relevant to the target of interest. To help select the most favorable screening library, a simple measure has been introduced that explicitly informs the medicinal chemist about the diversity of the investigated compound set.

A novel diversity assessment method, the Explicit Diversity Index (EDI), is introduced for drug-like molecules. EDI combines structural and synthesis-related dissimilarity values and expresses them as a single number. As an easily interpretable measure it facilitates the decision making in the design of combinatorial libraries, and may assist in the comparison of compound sets provided by different manufacturers. Through its rapid calculation algorithm, EDI enables the diversity assessment of in-house or commercial compound collections.

Besides EDI, the Similarity Manager calculates the most common similarity indices that quantify the structural diversity of a compound set: the TAT (Total Average Tanimoto) and NAT (Nearest Average Tanimoto) indices.
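
As an illustration of the Tanimoto-based indices mentioned above, the following minimal Python sketch computes a Tanimoto coefficient and two set-level averages. The definitions used here are assumptions, since this page does not give the formulas: TAT is taken as the mean similarity over all distinct compound pairs, and NAT as the mean of each compound's highest similarity to any other compound. Fingerprints are represented as sets of on-bit positions, and all function names are hypothetical; this is not the Similarity Manager's own code.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    if not a and not b:
        return 1.0  # convention assumed here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def total_average_tanimoto(fps):
    """Mean similarity over all distinct pairs (assumed TAT definition)."""
    pairs = list(combinations(fps, 2))
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


def nearest_average_tanimoto(fps):
    """Mean of each compound's nearest-neighbor similarity (assumed NAT definition)."""
    nearest = [
        max(tanimoto(fp, other) for j, other in enumerate(fps) if j != i)
        for i, fp in enumerate(fps)
    ]
    return sum(nearest) / len(nearest)


# Toy fingerprints: two close analogs and one unrelated compound.
fingerprints = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
print(total_average_tanimoto(fingerprints))
print(nearest_average_tanimoto(fingerprints))
```

Under these assumed definitions, lower values of either index indicate a more diverse set. The nearest-neighbor average is more sensitive to clusters of close analogs than the all-pairs average.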


Diverse Selection


Selecting representative, diverse subsets from large compound collections plays a key role in the initial design phase of successful screening procedures. Nowadays, drug candidate library providers offer several hundred thousand compound structures in a single collection, which makes a rapid but still accurate selection method indispensable.

The widespread use of combinatorial chemistry has made the selection of diverse subsets from virtual libraries common practice. For relatively small compound sets, the very simple maximum dissimilarity algorithm can provide an approximately optimal selection. However, the computational cost of this algorithm grows linearly with the size of the candidate pool and quadratically with the size of the selected subset, so the method is impractical for large compound sets. The OptiSim algorithm is one of the best-known modifications of this method, and several commercial software packages implement it. Instead of the whole candidate pool, OptiSim considers only a randomly selected subsample of candidates at each step. This can accelerate the selection process remarkably, but the diversity of the selected subset is generally lower than that obtained with the full maximum dissimilarity algorithm.
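
The trade-off described above can be sketched in a few lines of Python. This is a simplified illustration only, not the Similarity Manager's algorithm and not a full OptiSim implementation: it shows a greedy maximum dissimilarity (MaxMin) selection and an optional random-subsample mode in the spirit of OptiSim, omitting other details of the published method. Fingerprints are sets of on-bit positions, dissimilarity is assumed to be 1 minus the Tanimoto coefficient, and the function names, the `subsample_size` parameter, and the choice of the first pool member as the starting compound are all hypothetical.

```python
import random


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    if not a and not b:
        return 1.0  # convention assumed here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def min_dissimilarity(fp, chosen):
    """Distance (1 - Tanimoto) from fp to its closest already selected fingerprint."""
    return min(1.0 - tanimoto(fp, s) for s in chosen)


def max_dissimilarity_selection(pool, n_select, subsample_size=None, seed=0):
    """Greedy MaxMin selection; returns indices into pool.

    subsample_size=None scores every remaining candidate at each step.
    An integer scores only a random subsample (OptiSim-like simplification).
    """
    rng = random.Random(seed)
    remaining = list(range(len(pool)))
    selected = [remaining.pop(0)]  # arbitrary starting compound: first in pool
    while remaining and len(selected) < n_select:
        if subsample_size is None or subsample_size >= len(remaining):
            candidates = remaining
        else:
            candidates = rng.sample(remaining, subsample_size)
        chosen = [pool[i] for i in selected]
        # Pick the candidate farthest from its nearest selected compound.
        best = max(candidates, key=lambda i: min_dissimilarity(pool[i], chosen))
        selected.append(best)
        remaining.remove(best)
    return selected


pool = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}, {7, 8, 9, 10}]
full = max_dissimilarity_selection(pool, 3)
sampled = max_dissimilarity_selection(pool, 3, subsample_size=2, seed=1)
print(full, sampled)
```

In the full mode, each step compares every remaining candidate with every compound already selected, which is where the linear dependence on pool size and the quadratic dependence on subset size come from. The subsample mode scores fewer candidates per step, at the risk of missing the most dissimilar one.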

Similarity Manager 1.0 includes a novel diverse selection algorithm developed to speed up the time-consuming selection procedure without sacrificing the diversity of the selected subset.


Analog Search


The properties of hashed fingerprints dictate that similar structures will also have similar fingerprints. The reverse is not true in all cases, but in practice, structures with similar fingerprints are highly likely to be similar. Fingerprints therefore make so-called analog searches easy to perform: structures similar to specified compounds can be found quickly by comparing fingerprints. The Analog Search module of Similarity Manager 1.0 identifies the compounds that meet the specified similarity criteria.
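
A fingerprint-based analog search of the kind described above might look like the following minimal Python sketch. It assumes that the similarity criterion is a Tanimoto threshold, which this page does not specify; the function `analog_search`, the `threshold` parameter, and the compound names are hypothetical and do not reflect the Analog Search module's actual interface.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    if not a and not b:
        return 1.0  # convention assumed here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def analog_search(query, library, threshold=0.7):
    """Return (name, similarity) pairs at or above threshold, most similar first."""
    hits = []
    for name, fp in library.items():
        sim = tanimoto(query, fp)
        if sim >= threshold:
            hits.append((name, sim))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


# Toy library: fingerprints as sets of on-bit positions.
library = {
    "cmpd_A": {1, 2, 3, 4, 5},
    "cmpd_B": {1, 2, 3, 4, 6},
    "cmpd_C": {10, 11, 12},
}
query = {1, 2, 3, 4, 5}
hits = analog_search(query, library, threshold=0.6)
for name, sim in hits:
    print(name, round(sim, 3))
```

Because only fingerprints are compared, no structure matching is needed during the search, which is what makes this kind of screening fast. Hits may still include false positives, since similar fingerprints do not guarantee similar structures.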


Similarity Manager runs on the Windows platform.


If you need more information about our product, please e-mail CompuDrug International at