05-21-2004, 10:34 AM | #1 |
Guru
Posts: 914
Karma: 3410461
Join Date: May 2004
Device: Kindle Touch
|
Google PageRank Checksum Algorithm
Update 2: The working PHP source is now available here!
Update: Added the missing switch table.

You probably know about Google's PageRank, Google's indicator for the general importance of a page. To display PageRanks you normally need Google's toolbar (a browser plug-in for IE). There is, however, also a way to display PageRanks without the toolbar. You can request the PageRank of domain.com directly with the following URL (without line breaks):

http://www.google.com/search?
client=navclient-auto&ch=0123456789&
features=Rank&
q=info:http://www.domain.com/

The key is the parameter "ch", which transfers a checksum for the URL to Google. For a given URL, this checksum can only change when Google updates the Toolbar version. The checksum algorithm is also not publicly known. I was able to determine the underlying algorithm for calculating the checksum in Google's Toolbar 2.0.111:

Code:
GOOGLECHECK     proc near

var_8           = dword ptr -8
var_4           = dword ptr -4
url_offset      = dword ptr  8
url_length      = dword ptr  0Ch
magic_dword     = dword ptr  10h

                push    ebp
                mov     ebp, esp
                push    ecx
                push    ecx
                mov     eax, [ebp+url_length]
                cmp     eax, 0Ch
                push    ebx
                push    esi
                mov     esi, [ebp+magic_dword]  ; = 0xE6359A60
                push    edi
                mov     edi, 9E3779B9h          ; derived from the golden number, hi TEA ;)
                mov     ebx, edi
                mov     [ebp+var_4], eax
                jb      jump_1
                push    0Ch
                pop     ecx
                xor     edx, edx
                div     ecx
                mov     ecx, [ebp+url_offset]
                mov     [ebp+var_8], eax

loop_1:
                movzx   eax, byte ptr [ecx+7]
                movzx   edx, byte ptr [ecx+6]
                shl     eax, 8
                add     eax, edx
                movzx   edx, byte ptr [ecx+5]
                shl     eax, 8
                add     eax, edx
                movzx   edx, byte ptr [ecx+4]
                add     edx, edi
                shl     eax, 8
                lea     edi, [edx+eax]
                movzx   eax, byte ptr [ecx+0Bh]
                movzx   edx, byte ptr [ecx+0Ah]
                shl     eax, 8
                add     eax, edx
                movzx   edx, byte ptr [ecx+9]
                shl     eax, 8
                add     eax, edx
                movzx   edx, byte ptr [ecx+8]
                add     edx, esi
                shl     eax, 8
                lea     esi, [edx+eax]
                movzx   edx, byte ptr [ecx+3]
                movzx   eax, byte ptr [ecx+2]
                shl     edx, 8
                add     edx, eax
                movzx   eax, byte ptr [ecx+1]
                shl     edx, 8
                add     edx, eax
                movzx   eax, byte ptr [ecx]
                shl     edx, 8
                add     edx, eax
                sub     edx, edi
                sub     edx, esi
                mov     eax, esi
                shr     eax, 0Dh
                add     edx, ebx
                xor     edx, eax
                sub     edi, edx
                sub     edi, esi
                mov     eax, edx
                shl     eax, 8
                xor     edi, eax
                sub     esi, edi
                sub     esi, edx
                mov     eax, edi
                shr     eax, 0Dh
                xor     esi, eax
                sub     edx, edi
                sub     edx, esi
                mov     eax, esi
                shr     eax, 0Ch
                xor     edx, eax
                sub     edi, edx
                sub     edi, esi
                mov     eax, edx
                shl     eax, 10h
                xor     edi, eax
                sub     esi, edi
                sub     [ebp+var_4], 0Ch
                sub     esi, edx
                mov     eax, edi
                shr     eax, 5
                xor     esi, eax
                sub     edx, edi
                mov     eax, esi
                shr     eax, 3
                sub     edx, esi
                xor     edx, eax
                mov     ebx, edx
                sub     edi, ebx
                sub     edi, esi
                mov     eax, ebx
                shl     eax, 0Ah
                xor     edi, eax
                sub     esi, edi
                mov     eax, edi
                sub     esi, ebx
                shr     eax, 0Fh
                xor     esi, eax
                add     ecx, 0Ch
                dec     [ebp+var_8]
                jnz     loop_1
                jmp     short jump_2

jump_1:
                mov     ecx, [ebp+url_offset]

jump_2:
                add     esi, [ebp+url_length]
                mov     eax, [ebp+var_4]
                dec     eax
                cmp     eax, 0Ah                ; switch 11 cases
                ja      defaultswitch           ; default
                jmp     ds:off_100307EA[eax*4]  ; switch jump

switch_10:
                movzx   eax, byte ptr [ecx+0Ah] ; case 0xA
                shl     eax, 18h
                add     esi, eax

switch_9:
                movzx   eax, byte ptr [ecx+9]   ; case 0x9
                shl     eax, 10h
                add     esi, eax

switch_8:
                movzx   eax, byte ptr [ecx+8]   ; case 0x8
                shl     eax, 8
                add     esi, eax

switch_7:
                movzx   eax, byte ptr [ecx+7]   ; case 0x7
                movzx   edx, byte ptr [ecx+6]
                shl     eax, 8
                add     eax, edx
                movzx   edx, byte ptr [ecx+5]
                shl     eax, 8
                add     eax, edx
                movzx   edx, byte ptr [ecx+4]
                shl     eax, 8
                add     edx, edi
                lea     edi, [edx+eax]
                jmp     short switch_3          ; case 0x3

switch_6:
                movzx   eax, byte ptr [ecx+6]   ; case 0x6
                shl     eax, 10h
                add     edi, eax

switch_5:
                movzx   eax, byte ptr [ecx+5]   ; case 0x5
                shl     eax, 8
                add     edi, eax

switch_4:
                movzx   eax, byte ptr [ecx+4]   ; case 0x4
                add     edi, eax

switch_3:
                movzx   eax, byte ptr [ecx+3]   ; case 0x3
                movzx   edx, byte ptr [ecx+2]
                shl     eax, 8
                add     eax, edx
                movzx   edx, byte ptr [ecx+1]
                movzx   ecx, byte ptr [ecx]
                shl     eax, 8
                add     eax, edx
                shl     eax, 8
                add     ecx, ebx
                lea     ebx, [ecx+eax]
                jmp     short defaultswitch     ; default

switch_2:
                movzx   eax, byte ptr [ecx+2]   ; case 0x2
                shl     eax, 10h
                add     ebx, eax

switch_1:
                movzx   eax, byte ptr [ecx+1]   ; case 0x1
                shl     eax, 8
                add     ebx, eax

switch_0:
                movzx   eax, byte ptr [ecx]     ; case 0x0
                add     ebx, eax

defaultswitch:
                sub     ebx, edi                ; default
                sub     ebx, esi
                mov     eax, esi
                shr     eax, 0Dh
                xor     ebx, eax
                sub     edi, ebx
                sub     edi, esi
                mov     eax, ebx
                shl     eax, 8
                xor     edi, eax
                sub     esi, edi
                sub     esi, ebx
                mov     eax, edi
                shr     eax, 0Dh
                xor     esi, eax
                sub     ebx, edi
                sub     ebx, esi
                mov     eax, esi
                shr     eax, 0Ch
                xor     ebx, eax
                sub     edi, ebx
                sub     edi, esi
                mov     eax, ebx
                shl     eax, 10h
                xor     edi, eax
                sub     esi, edi
                mov     eax, edi
                sub     esi, ebx
                shr     eax, 5
                xor     esi, eax
                sub     ebx, edi
                mov     eax, esi
                mov     ecx, eax
                sub     ebx, eax
                shr     ecx, 3
                xor     ebx, ecx
                sub     edi, ebx
                sub     edi, eax
                mov     ecx, ebx
                shl     ecx, 0Ah
                xor     edi, ecx
                sub     eax, edi
                sub     eax, ebx
                shr     edi, 0Fh
                xor     eax, edi
                pop     edi
                pop     esi
                pop     ebx
                leave
                retn
GOOGLECHECK     endp

; Switch table
off_100307EA    dd offset switch_0
                dd offset switch_1
                dd offset switch_2
                dd offset switch_3
                dd offset switch_4
                dd offset switch_5
                dd offset switch_6
                dd offset switch_7
                dd offset switch_8
                dd offset switch_9
                dd offset switch_10

Now my question: Can anyone rewrite this code snippet into PHP?

Andy

Last edited by doctorow; 06-27-2004 at 09:57 AM. |
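As a starting point for a port, here is a minimal sketch of the listing above in Python rather than PHP. It is a hand transcription, not tested against real Toolbar output. The names `mix`, `google_check`, `GOLDEN` and `MAGIC` are made up for illustration; `a`, `b` and `c` stand for ebx, edi and esi. The structure (12-byte blocks, three 32-bit accumulators, the 0x9E3779B9 constant) looks the same as Bob Jenkins's lookup2 hash, but that is an observation, not something stated in this thread. What exact string the Toolbar passes in, and how the returned dword is turned into the "ch" value, is not covered here.

```python
GOLDEN = 0x9E3779B9   # constant loaded into edi and ebx in the listing
MAGIC = 0xE6359A60    # value of magic_dword noted in the listing
MASK = 0xFFFFFFFF     # keep every result inside 32 bits, like a register


def mix(a, b, c):
    """The sub/shift/xor sequence used in loop_1 and again in defaultswitch."""
    a = (a - b - c) & MASK; a ^= c >> 13
    b = (b - c - a) & MASK; b ^= (a << 8) & MASK
    c = (c - a - b) & MASK; c ^= b >> 13
    a = (a - b - c) & MASK; a ^= c >> 12
    b = (b - c - a) & MASK; b ^= (a << 16) & MASK
    c = (c - a - b) & MASK; c ^= b >> 5
    a = (a - b - c) & MASK; a ^= c >> 3
    b = (b - c - a) & MASK; b ^= (a << 10) & MASK
    c = (c - a - b) & MASK; c ^= b >> 15
    return a, b, c


def google_check(data: bytes, magic: int = MAGIC) -> int:
    """Return the 32-bit value the routine leaves in eax for the given bytes."""
    a = b = GOLDEN
    c = magic & MASK
    pos = 0
    remaining = len(data)

    # loop_1: consume the input in 12-byte blocks, three little-endian dwords each
    while remaining >= 12:
        a = (a + int.from_bytes(data[pos:pos + 4], "little")) & MASK
        b = (b + int.from_bytes(data[pos + 4:pos + 8], "little")) & MASK
        c = (c + int.from_bytes(data[pos + 8:pos + 12], "little")) & MASK
        a, b, c = mix(a, b, c)
        pos += 12
        remaining -= 12

    # jump_2: add the full length, then fold in the 0..11 leftover bytes.
    # Bytes 0-3 go to a, 4-7 to b, 8-10 to c shifted up by one byte,
    # which is what the fall-through switch cases do.
    c = (c + len(data)) & MASK
    tail = data[pos:]
    a = (a + int.from_bytes(tail[0:4], "little")) & MASK
    b = (b + int.from_bytes(tail[4:8], "little")) & MASK
    c = (c + (int.from_bytes(tail[8:11], "little") << 8)) & MASK

    # defaultswitch: one last mixing round, result is c (eax in the listing)
    a, b, c = mix(a, b, c)
    return c
```

Calling `google_check(b"http://www.domain.com/")` yields an integer in the range 0 to 0xFFFFFFFF. Passing the URL as plain ASCII bytes is an assumption.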
06-02-2004, 04:06 PM | #2 | |
Guru
Posts: 914
Karma: 3410461
Join Date: May 2004
Device: Kindle Touch
|
Somebody asked me the following, and I thought it better to show the answer to everyone:
Quote:
url_offset is a pointer to the URL of the web site we want to check the rank for.
url_length is the length of that URL.
magic_dword is a static 32-bit word passed to the function. As shown in the code, it is 0xE6359A60. |
|
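To make the three parameters concrete, here is a small Python sketch of how they could be prepared and how the routine reads the URL. The listing builds each 32-bit word from four single bytes with movzx, shl and add, which amounts to a little-endian load. The helper name `load_dword_le` and the sample URL are illustrative only, and treating the URL as ASCII bytes is an assumption.

```python
MAGIC_DWORD = 0xE6359A60  # the static value named in the post above


def load_dword_le(buf: bytes, offset: int) -> int:
    """Assemble four bytes the way the listing does: highest byte first,
    shift left by 8, add the next lower byte, and so on."""
    value = buf[offset + 3]
    value = (value << 8) + buf[offset + 2]
    value = (value << 8) + buf[offset + 1]
    value = (value << 8) + buf[offset]
    return value


url = b"http://www.domain.com/"   # url_offset would point at these bytes
url_length = len(url)             # url_length is simply the byte count

# The first dword the routine would read from the URL
first_dword = load_dword_le(url, 0)
```

The manual shifts give the same result as `int.from_bytes(url[0:4], "little")`, which is shorter if the target language has an equivalent.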
06-08-2004, 12:07 PM | #3 |
Nameless Being
|
Would it be possible to use that assembly code to make a program in C and assembly to generate the checksums?
|
06-08-2004, 02:22 PM | #4 |
Guru
Posts: 914
Karma: 3410461
Join Date: May 2004
Device: Kindle Touch
|
Sure it would be. You can compile the assembler stub above as an .obj and then easily link it into any C application (I've been doing this already, actually).
But it would be much more interesting to have the above code in PHP, allowing for various web applications! |
06-10-2004, 07:59 AM | #5 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jun 2004
|
Undefined symbol : off_100307EA
Hello
The assembler told me: undefined symbol : off_100307EA
Can you tell me what that identifier means, and how it should be declared, defined, and set? Thanks. |
06-12-2004, 02:56 PM | #6 |
Guru
Posts: 914
Karma: 3410461
Join Date: May 2004
Device: Kindle Touch
|
off_100307EA is a jump table that I forgot to include. I will add the fixed code tomorrow.
Still, is no one capable of converting ASM -> PHP? |
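For anyone porting without the table: in the listing, off_100307EA is indexed with the number of leftover bytes minus one (0 to 10), and each entry is the address of one of the switch_N labels. Execution enters at that label and falls through the cases below it, so every leftover byte gets added to one of the three accumulators. A rough Python sketch of that dispatch follows. `CASES` and `fold_tail` are made-up names, and the per-byte loop flattens the whole-dword shortcuts that switch_7 and switch_3 take in the listing.

```python
MASK = 0xFFFFFFFF

# What "case i" adds: (accumulator, shift) for leftover byte i.
# a = ebx, b = edi, c = esi in the listing. The c bytes start at shift 8
# because the low byte of c already received the total length.
CASES = [
    ("a", 0), ("a", 8), ("a", 16), ("a", 24),
    ("b", 0), ("b", 8), ("b", 16), ("b", 24),
    ("c", 8), ("c", 16), ("c", 24),
]


def fold_tail(a: int, b: int, c: int, tail: bytes):
    """Emulate the jump through the table followed by fall-through."""
    regs = {"a": a, "b": b, "c": c}
    index = len(tail) - 1                 # dec eax
    if not 0 <= index <= 0x0A:            # cmp eax, 0Ah / ja defaultswitch
        return a, b, c
    # Enter at case `index` and fall through down to case 0
    for i in range(index, -1, -1):
        name, shift = CASES[i]
        regs[name] = (regs[name] + (tail[i] << shift)) & MASK
    return regs["a"], regs["b"], regs["c"]
```

In a language with a real switch statement, the same effect comes from ordering the cases from 10 down to 0 and leaving out the breaks.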
06-13-2004, 02:12 PM | #7 | |
Junior Member
Posts: 1
Karma: 12
Join Date: Jun 2004
|
Quote:
Somebody out there has released a PHP script for calculating the Google PageRank checksums needed for the parameter "ch" in all PageRank queries, even without the Google Toolbar. They say it has been tested with over 1.5 million different domains. You can test this script at Google PageRank checksum calculation. There is some additional information available about this PHP script. I've made several requests for my own domains, and the computed checksums all work properly. Maybe this script is what you need. Many regards! |
|
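Once a checksum is available, the query URL from the first post can be assembled mechanically. A minimal Python sketch, assuming the parameter layout shown there. The function name `pagerank_query_url` is made up, "0123456789" is the placeholder checksum from that post and not a real value, and no request is actually sent.

```python
from urllib.parse import urlencode


def pagerank_query_url(page_url: str, checksum: str) -> str:
    """Build the navclient-auto query URL in the form shown in the first post."""
    params = {
        "client": "navclient-auto",
        "ch": checksum,
        "features": "Rank",
        "q": "info:" + page_url,
    }
    # Keep ':' and '/' unescaped so the result matches the form in the thread
    return "http://www.google.com/search?" + urlencode(params, safe=":/")


example = pagerank_query_url("http://www.domain.com/", "0123456789")
```

Whether the server expects other characters in the page URL (such as '?' or '&') to be escaped is not stated in the thread; `urlencode` escapes them here.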
06-17-2004, 10:05 PM | #8 |
Nameless Being
|
>off_100307ea is a jump table that I forgot to include. I will add the fixed code tomorrow.
Could you post the table? I'll try to -> PHP it. (= |
06-17-2004, 10:16 PM | #9 |
Nameless Being
|
Also could you give me the memory address for that code you posted?
|
06-22-2004, 06:51 AM | #10 |
Guru
Posts: 914
Karma: 3410461
Join Date: May 2004
Device: Kindle Touch
|
I added the missing jump switch table (off_100307EA).
|
06-23-2004, 08:10 PM | #11 |
Nameless Being
|
Is anyone seriously working on porting this to PHP? If so, how far have you got? I am still documenting the ASM code, but I'm almost finished and might have a working PHP implementation this weekend. If anyone would like to discuss working together on the port, please contact me at alex.stapleton@gmail.com so we can swap information.
|
06-24-2004, 03:30 AM | #12 |
Guru
Posts: 914
Karma: 3410461
Join Date: May 2004
Device: Kindle Touch
|
Alex (I assume your name is Alex from your email),
please post and share your info here in this thread instead of e-mailing each other. Someone already messaged me to say that he has working Perl code based on the code we posted here. I hope that he is also going to share it with us. |
06-24-2004, 06:37 AM | #13 |
Nameless Being
|
OK, I will post what I know so far later today. I'm currently writing a PHP implementation of what appears to be a pretty simple hashing algorithm. I should have a fully functioning PHP implementation by Saturday or Sunday.
|
06-24-2004, 07:53 AM | #14 |
Nameless Being
|
My work is available here
http://meese.ath.cx/google/ I am working on those files directly so they will be updated as I make changes. The phps might be a bit behind the php but I will try and update it frequently. These are pretty rough working notes for it, so don't expect much for a while, they may prove useful for someone else though. |
06-24-2004, 10:26 AM | #15 |
Guru
Posts: 914
Karma: 3410461
Join Date: May 2004
Device: Kindle Touch
|
Thanks for the update! Cool idea to use $variables for the registers (eax, ...).
|
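The register-as-variable approach can be sketched as follows, in Python rather than PHP. Each x86 instruction becomes one assignment, and every result is masked back to 32 bits because the host language's integers do not wrap the way a register does. The helper names `add32`, `sub32`, `shl32`, `shr32` and `first_mix_step` are made up for illustration. The instruction sequence is the one that follows the byte loads in loop_1.

```python
MASK32 = 0xFFFFFFFF


def add32(x: int, y: int) -> int:
    return (x + y) & MASK32


def sub32(x: int, y: int) -> int:
    return (x - y) & MASK32


def shl32(x: int, n: int) -> int:
    return (x << n) & MASK32


def shr32(x: int, n: int) -> int:
    # Logical (unsigned) shift right, like the shr instruction
    return (x & MASK32) >> n


def first_mix_step(ebx: int, edi: int, esi: int, edx: int) -> int:
    """Literal transcription of six instructions from loop_1."""
    edx = sub32(edx, edi)     # sub edx, edi
    edx = sub32(edx, esi)     # sub edx, esi
    eax = esi                 # mov eax, esi
    eax = shr32(eax, 0x0D)    # shr eax, 0Dh
    edx = add32(edx, ebx)     # add edx, ebx
    edx ^= eax                # xor edx, eax
    return edx
```

A line-by-line transcription like this is verbose, but it is easy to check against the listing before simplifying.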