The intelligent solution that turns to be unadequate
Recenlty I asked for advice on StackOverflow with a question related to sorting information on a Django application.
I copy the question here:
I’m trying to do something similar to this, in Django. This is part of the page of Anna:
Pos NickName Points --- --------- ------ 1 The best 1000 ... 35 Roger 550 36 Anna 545 37 Paul 540
It’s a chart showing the scoring system, and it intends to show the first position, as well as the relative position of the presented player.
Showing the first one it’s easy, as it’s only making a query to the database and extracting the first one:
best = Score.objects.all().order_by('-points')[0]
But I’m having problems getting the ones close to the presented player (Anna, in this case). I don’t want to go searching through the complete list, as the complete list of players can be quite long.
Maybe there’s a way to know the position a register occupies in an ordered list…
Any ideas on how to achieve it?
Well, I get with a clever solution that was, well, count the number of players with higher number of points than the actual player, so you’ll get the position. It’s definitively more elegant than rendering the complete list and then searching for the position of a particular player.
The problem is that it won’t work when you have several players with exactly the same number of points. In that case, you’ll end getting the position of the first one with the same number of points, but not necessarily the one you’re looking, so you end with this kinds of tables:
Pos NickName Points
--- --------- ------
1 The best 1000
...
35 Roger 545
36 Paul 550
37 Anna 550
Not showing the actual player on the correct place, or even worse…
Pos NickName Points
--- --------- ------
1 The best 1000
...
35 Roger 545
36 Paul 550
37 Lucy 550
Not showing the player at all!
To make things worse the order is not always the same, as you have some players with the same number of points and the DB just seems to not return always the same order, so you need to introduce another element for sorting (the name, for example) to ensure that the ranking is not dancing just because the players have the exact number of points.
So.. at the end I needed to make a sort using two parameters (points, then name), get the complete list, get the index of the player and do the things the boring way.. Actually it’s working fine and nice, but I wonder if anyone knows a way of getting that avoiding the need of getting the complete list…
Perhaps you could add some extra data to handle out the sorting issue? Let the one who gains specific score to be the first within that score by default. It would take an extra field and logic to handle this, though.
You can use this information to tweak your “players before Anna” calculation.
Yes, that is actually what is happening. I have simplified the problem on the post, there are more fields, so you can sort by the number of points and other fields…
In Django you’ve just to sort this way (very related to how you do it on SQL). First by points (higher the better), then by “points date” (lower the better)
sorted_list = Score.objects.all().order_by(‘-points’, ‘points_date’)
But you’re still have to get all the sorted list and get the position of a particular player searching through all the list…
Okay. I think I understand the problem a bit better now. I think it boils down to the fact that there needs to be some neat way to store the adjacency data using a doubly linked list.
Some pseudocode to illustrate the idea (imaginary API):
person = Score.find(‘Anna’)
previous_person = person.prev
next_person = person.next
You should be able to access the score and name via attributes. There are some special cases (ie. first/last) where prev/next may be None.
I’m not sure how this would work out in Django. At least the API would be simple and you could hide the ugly details there. 😛
Maybe I’m missing something, but it seems to me if you pre-compute the ‘Pos’ field (or something like it) on insert, update, or delete, then your problem becomes trivial. On the other hand, you could end up updating the entire table on a single-row insert…depends on how often you insert into the table vs. read from it, and how big the data is.
Wouldn’t something like Score.objects.filter(Q(points__lt=points) | (Q(points=points) & Q(name__lt=name))).count() work?
I don’t know if you can do it with the django ORM, but with raw SQL you can use the count method with someting like
select count(*) from score where (points,name) < (545
,'Anna')
This works in Postgresql.
It's just using in the where clause the fields you are using for sorting all combined.
In some other databases where I cannot use this kind of grouping (like a tuple) I use the || (concatenation) operand.