package-info.java example

/*******************************************************************************
 * Copyright (c) 2010 Haifeng Li
 *   
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *  
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *******************************************************************************/

/**
 * Nearest neighbor search. Nearest neighbor search is an optimization problem
 * for finding closest points in metric spaces. The problem is: given a set S
 * of points in a metric space M and a query point q ∈ M, find the closest
 * point in S to q.
 * <p>
 * The simplest solution to the NNS problem is to compute the distance from
 * the query point to every other point in the database, keeping track of
 * the "best so far". This algorithm, sometimes referred to as the naive
 * approach, has a running time of O(Nd) where N is the cardinality of S
 * and d is the dimensionality of M. There are no search data structures
 * to maintain, so linear search has no space complexity beyond the storage
 * of the database. Surprisingly, naive search can outperform space
 * partitioning approaches in high-dimensional spaces, where tree-based
 * methods end up visiting most of the points anyway.
 * <p>
 * Space partitioning, a branch and bound methodology, has been applied to the
 * problem. The simplest is the k-d tree, which iteratively bisects the search
 * space into two regions containing half of the points of the parent region.
 * Queries are performed via traversal of the tree from the root to a leaf by
 * evaluating the query point at each split. Depending on the distance
 * specified in the query, neighboring branches that might contain hits
 * may also need to be evaluated. For constant dimension, the average query
 * time complexity is O(log N) in the case of randomly distributed points.
 * Alternatively, the R-tree data structure was designed to support nearest
 * neighbor search in a dynamic context, as it has efficient algorithms for
 * insertions and deletions.
 * <p>
 * In a general metric space, the branch and bound approach is known as
 * metric trees. Particular examples include the vp-tree and the BK-tree.
 * <p>
 * Locality sensitive hashing (LSH) is a technique for grouping points in space
 * into 'buckets' based on some distance metric operating on the points.
 * Points that are close to each other under the chosen metric are mapped
 * to the same bucket with high probability.
 * <p>
 * The cover tree has a theoretical bound that is based on the dataset's
 * doubling constant. The bound on search time is O(c<sup>12</sup> log n) where c is
 * the expansion constant of the dataset.
 * 
 * @author Haifeng Li
 */
package smile.neighbor;
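
The naive approach described in the package documentation above can be sketched in a few lines. This is a minimal illustration, not code from the smile.neighbor package: the class name `LinearSearchSketch` and its methods are hypothetical, and it assumes points are `double[]` vectors of equal length compared by Euclidean distance.

```java
/** Hypothetical helper for illustration only; not part of smile.neighbor. */
final class LinearSearchSketch {
    private LinearSearchSketch() {}

    /** Squared Euclidean distance; assumes a and b have the same length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /**
     * Returns the index of the point in data closest to q,
     * or -1 if data is empty. Runs in O(N d) time with no extra storage.
     */
    static int nearest(double[][] data, double[] q) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < data.length; i++) {
            // Comparing squared distances preserves the ordering
            // and avoids a square root per point.
            double d = squaredDistance(data[i], q);
            if (d < bestDist) {
                bestDist = d; // the "best so far"
                best = i;
            }
        }
        return best;
    }
}
```

Calling `LinearSearchSketch.nearest(data, q)` scans every row of `data` once. The sketch keeps no index structure, which matches the point above that linear search needs no space beyond the database itself.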