The intelligent solution that turns to be unadequate

Recenlty I asked for advice on StackOverflow with a question related to sorting information on a Django application.

I copy the question here:

I’m trying to do something similar to this, in Django. This is part of the page of Anna:

Pos NickName      Points
--- ---------     ------
1   The best      1000
...
35  Roger         550
36  Anna          545
37  Paul          540

It’s a chart showing the scoring system, and it intends to show the first position, as well as the relative position of the presented player.

Showing the first one it’s easy, as it’s only making a query to the database and extracting the first one:

 best = Score.objects.all().order_by('-points')[0]

But I’m having problems getting the ones close to the presented player (Anna, in this case). I don’t want to go searching through the complete list, as the complete list of players can be quite long.

Maybe there’s a way to know the position a register occupies in an ordered list…

Any ideas on how to achieve it?

Well, I get with a clever solution that was, well, count the number of players with higher number of points than the actual player, so you’ll get the position. It’s definitively more elegant than rendering the complete list and then searching for the position of a particular player.

The problem is that it won’t work when you have several players with exactly the same number of  points. In that case, you’ll end getting the position of the first one with the same number of points, but not necessarily the one you’re looking, so you end with this kinds of tables:

Pos NickName      Points
--- ---------     ------
1   The best      1000
...
35  Roger         545
36  Paul          550
37  Anna          550

Not showing the actual player on the correct place, or even worse…

Pos NickName      Points
--- ---------     ------
1   The best      1000
...
35  Roger         545
36  Paul          550
37  Lucy          550

Not showing the player at all!

To make things worse the order is not always the same, as you have some players with the same number of points and the DB just seems to not return always the same order,  so you need to introduce another element for sorting (the name, for example) to ensure that the ranking is not dancing just because the players have the exact  number of points.

So.. at the end I needed to make a sort using two parameters (points, then name), get the complete list, get the index of the player and do the things the boring way.. Actually it’s working fine and nice, but I wonder if anyone knows a way of getting that avoiding the need of getting the complete list…

6 Comments on “The intelligent solution that turns to be unadequate

  1. Perhaps you could add some extra data to handle out the sorting issue? Let the one who gains specific score to be the first within that score by default. It would take an extra field and logic to handle this, though.

    You can use this information to tweak your “players before Anna” calculation.

    • Yes, that is actually what is happening. I have simplified the problem on the post, there are more fields, so you can sort by the number of points and other fields…

      In Django you’ve just to sort this way (very related to how you do it on SQL). First by points (higher the better), then by “points date” (lower the better)

      sorted_list = Score.objects.all().order_by(‘-points’, ‘points_date’)

      But you’re still have to get all the sorted list and get the position of a particular player searching through all the list…

  2. Okay. I think I understand the problem a bit better now. I think it boils down to the fact that there needs to be some neat way to store the adjacency data using a doubly linked list.

    Some pseudocode to illustrate the idea (imaginary API):
    person = Score.find(‘Anna’)
    previous_person = person.prev
    next_person = person.next

    You should be able to access the score and name via attributes. There are some special cases (ie. first/last) where prev/next may be None.

    I’m not sure how this would work out in Django. At least the API would be simple and you could hide the ugly details there. 😛

  3. Maybe I’m missing something, but it seems to me if you pre-compute the ‘Pos’ field (or something like it) on insert, update, or delete, then your problem becomes trivial. On the other hand, you could end up updating the entire table on a single-row insert…depends on how often you insert into the table vs. read from it, and how big the data is.

  4. Wouldn’t something like Score.objects.filter(Q(points__lt=points) | (Q(points=points) & Q(name__lt=name))).count() work?

  5. I don’t know if you can do it with the django ORM, but with raw SQL you can use the count method with someting like

    select count(*) from score where (points,name) < (545
    ,'Anna')

    This works in Postgresql.
    It's just using in the where clause the fields you are using for sorting all combined.
    In some other databases where I cannot use this kind of grouping (like a tuple) I use the || (concatenation) operand.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: