Sortzzy is a utility module that provides a simple way to fuzzy-sort an array of JSON objects against a target model and a set of weighted field descriptors. Strings in the data set are compared against the model with the built-in Levenshtein distance algorithm. Numeric values are scored by their distance from a given number within a bounding range.
This utility grew out of a need to find the best-matching song for a given starting model. Song titles, album titles, and artist names don't always match exactly, and I also needed to take numeric data such as track times into account.
Given some song data:
var data = [
{
artistName: 'Justin Bieber',
collectionName: 'One Time (My Heart Edition) - Single',
trackName: 'One Time (My Heart Edition)',
trackTimeMillis: 191697,
},
{
artistName: 'Justin Bieber',
collectionName: 'My Worlds Acoustic',
trackName: 'One Time',
trackTimeMillis: 186267,
},
{
artistName: 'Justin Bieber',
collectionName: 'Radio Disney Jams 12',
trackName: 'One Time (My Heart Edition)',
trackTimeMillis: 190667,
},
{
artistName: 'The Justin Bieber Tribute Band',
collectionName: 'One Time - Single',
trackName: 'One Time',
trackTimeMillis: 240148,
}
// . . .
]
var sortzzy = require('sortzzy')
// Create the model to match against
var model = {
artistName : 'justin bieber',
trackName : 'One Time',
trackTimeMillis : 190000
}
// Define the fields
var fields = [
{name:'artistName', type:'string', weight:1, options:{ignoreCase:true}},
{name:'trackName', type:'string', weight:1, options:{ignoreCase:true}},
{name:'trackTimeMillis', type:'numeric', weight:2, fixedRange:[160000, 220000]}
]
var result = sortzzy.sort(data, model, fields);
/*
result[0] ==
{
score: 0.9688916666666667,
data: {
artistName: 'Justin Bieber',
collectionName: 'My Worlds Acoustic',
trackName: 'One Time',
trackTimeMillis: 186267
}
}
*/
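The score shown above is consistent with a weighted average of per-field scores, where the two strings match exactly once case is ignored (score 1 each) and the numeric field scores 1 minus its distance from the model value divided by the width of fixedRange. The sketch below reproduces that arithmetic. It is inferred from the numbers in the example rather than taken from the library's source: the helper names numericScore and weightedAverage are hypothetical, and the clamping to [0, 1] is an assumption.

```javascript
// Hypothetical helper (not part of sortzzy): score a number by its distance
// from the target, scaled by the width of the bounding range.
// Clamping to [0, 1] is an assumption.
function numericScore(value, target, range) {
  var width = range[1] - range[0];
  var score = 1 - Math.abs(value - target) / width;
  return Math.max(0, Math.min(1, score));
}

// Hypothetical helper: weighted average of per-field scores.
function weightedAverage(parts) {
  var total = 0;
  var weights = 0;
  parts.forEach(function (p) {
    total += p.score * p.weight;
    weights += p.weight;
  });
  return weights === 0 ? 0 : total / weights;
}

// 'My Worlds Acoustic' entry: both strings match, track time is 186267 ms.
var timeScore = numericScore(186267, 190000, [160000, 220000]);
var overall = weightedAverage([
  { score: 1, weight: 1 },         // artistName
  { score: 1, weight: 1 },         // trackName
  { score: timeScore, weight: 2 }  // trackTimeMillis
]);
console.log(overall.toFixed(4)); // 0.9689
```

Because trackTimeMillis carries a weight of 2, a track time far from the model pulls the overall score down twice as much as a poor string match on either name field.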
Releases are available for download from GitHub. Alternatively, you can install using Node Package Manager (npm):
npm install sortzzy
sort() scores each item in the array against the given model using the array of field descriptors. It returns a new array sorted by score: either entries that hold the score in a score element and the original item in a data element, or just the original items with the score left out.
Arguments
arr - An array of JSON objects.
model - A JSON object that is the model of the item you are looking for.
fields - An array of field descriptors. Each field descriptor can have the following
[0,100]
options -
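The two return shapes described for sort() can be sketched as follows. This is a minimal illustration, not the library's implementation: rankByScore, its scoreFn callback, the dataOnly flag, and the toy closeness scorer are all hypothetical names introduced here.

```javascript
// Hypothetical sketch (not sortzzy's code): rank items with a caller-supplied
// scoring function. scoreFn(item) is assumed to return a number in [0, 1].
function rankByScore(items, scoreFn, dataOnly) {
  var scored = items.map(function (item) {
    return { score: scoreFn(item), data: item };
  });
  // Highest score first. Only the new array is reordered, not the input.
  scored.sort(function (a, b) { return b.score - a.score; });
  // dataOnly is an assumed flag: drop the score wrapper when it is truthy.
  return dataOnly ? scored.map(function (s) { return s.data; }) : scored;
}

var songs = [
  { trackName: 'One Time', trackTimeMillis: 240148 },
  { trackName: 'One Time', trackTimeMillis: 186267 }
];

// Toy scorer: the closer to 190000 ms, the better (range width 60000).
function closeness(song) {
  return Math.max(0, 1 - Math.abs(song.trackTimeMillis - 190000) / 60000);
}

var ranked = rankByScore(songs, closeness);      // [{score, data}, ...]
var plain = rankByScore(songs, closeness, true); // original items only
console.log(ranked[0].data.trackTimeMillis);     // 186267
```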
Same as sort(), but returns only the score for a single object compared against the model.
Computes the Levenshtein distance between stringX and stringY.
Options
Same as levenshtein, but returns a score between 0 and 1.
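For readers unfamiliar with the algorithm, here is a minimal sketch of the Levenshtein distance and one common way to turn it into a 0 to 1 score. The function names levenshteinDistance and levenshteinScore are hypothetical, and the normalization (1 minus the distance divided by the length of the longer string) is an assumption; the library may normalize differently. The ignoreCase option mirrors the one used in the field descriptors above.

```javascript
// Hypothetical sketch: dynamic-programming Levenshtein distance, keeping
// only two rows of the edit-distance table at a time.
function levenshteinDistance(a, b, options) {
  if (options && options.ignoreCase) {
    a = a.toLowerCase();
    b = b.toLowerCase();
  }
  var prev = [];
  var curr = [];
  var i, j;
  // Row 0: turning an empty prefix of a into b[0..j) takes j insertions.
  for (j = 0; j <= b.length; j++) prev[j] = j;
  for (i = 1; i <= a.length; i++) {
    curr = [i]; // deleting i characters of a yields the empty string
    for (j = 1; j <= b.length; j++) {
      var cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalization: 1 - distance / length of the longer string.
function levenshteinScore(a, b, options) {
  var longest = Math.max(a.length, b.length);
  if (longest === 0) return 1; // two empty strings are identical
  return 1 - levenshteinDistance(a, b, options) / longest;
}

console.log(levenshteinDistance('kitten', 'sitting')); // 3
console.log(levenshteinScore('Justin Bieber', 'justin bieber', { ignoreCase: true })); // 1
```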