문제 설명
geo 및 int의 기능에 의한 스마트 정렬 (Smart sorting by function of geo and int)
I'm thinking about ways to solve the following task.
We are developing a service (website) which has some objects. Each object has geo field (lat and long). It's about 200‑300 cities with objects can be connected. Amount of objects is thousands and tens of thousands.
Also each object has date of creation.
We need to search objects with sorting by function of distance and freshness.
E.g. we have two close cities A and B. User from city A authorizes and he should see objects from city A and then, on some next pages, from city B (because objects from A are closer). But, if there is an object from A which was added like a year ago, and an object from B which was added today, then B's object should be displayed befare A's one.
So, for peoeple from city A we can create special field with relevant index like = 100*distance + age_in_days And then sort by this field and we will get data as we need.
The problem is such relevant index will not work for all other people from other places.
In my example i used linear function but it's just an example, we will need to fit correct function.
The site will work on our servers, so we can use almost any database or any other software (i supposed to use mongodb)
참조 솔루션
방법 1:
I have following ideas
Recacl relevant index every day and keep it with object like
{ fields : ..., relindex : { cityA : 100, cityB : 120 } }
And if user belongs to cityA then sort by relindex.cityA
Disadvantages:
- Recurrent update of all object, but i dont think it's a hude problem
- Huge mongo index. If we have about 300 cities than each object will have 300 indexed fields
- Hard to add new cities.
- Use 3d spatial index: (lat, long, freshness). But i dont know if any database supports 3d geo‑patial
- Compact close objects in cluster and search only in cluster but not by whole base. But im not sure that it's ok.
방법 2:
I think there are four possible solutions:
1) Use 3D index ‑ lat, lon, time.
2) Distance is more important ‑ use some geo index and select nearest objects. If the object is too old then discard it and increase allowed distance. Stop after you have enough objects.
3) Time is more important ‑ index by time and discard the objects which are too far.
4) Approximate distance ‑ choose some important points (centre of cities or centre of clusters of objects) and calculate the distances from these important points up front. The query will first find the nearest important point and then use index to find the data. Alternatively you can create clusters from your objects and then calculate the distance in the query. The point here is that the amount of clusters is limited.
(by Iliya Garakh、Iliya Garakh、Strix)