Old 05-21-2004, 10:34 AM   #1
doctorow
Guru
Posts: 914
Karma: 3410461
Join Date: May 2004
Device: Kindle Touch
Google PageRank Checksum Algorithm

Update 2: The working PHP source is now available here!

Update: Added the missing switch table.


You probably know about Google's PageRank, Google's indicator of the general importance of a page. To display PageRanks, you normally need the Google Toolbar (a browser plug-in for IE).

Well, there is also a way to display PageRanks without the toolbar. You can request the PageRank of domain.com directly with the following URL (without line breaks):

http://www.google.com/search?
client=navclient-auto&ch=0123456789&
features=Rank&
q=info:http://www.domain.com/

The key is the "ch" parameter, which passes a checksum of the URL to Google. The checksum for a given URL can only change when Google updates the Toolbar version, and the algorithm behind it is not publicly known.
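
To make the URL layout above concrete, here is a minimal Python sketch that assembles the request string from its parts. The function name `pagerank_query_url` is made up for illustration, and the checksum is simply passed in as text; nothing here computes it or contacts Google.

```python
from urllib.parse import urlencode


def pagerank_query_url(page_url: str, checksum: str) -> str:
    """Assemble the query URL from the post (hypothetical helper)."""
    params = {
        "client": "navclient-auto",
        "ch": checksum,            # the checksum discussed in this thread
        "features": "Rank",
        "q": "info:" + page_url,
    }
    # safe=":/" keeps the colon and slashes readable, as in the post
    return "http://www.google.com/search?" + urlencode(params, safe=":/")


print(pagerank_query_url("http://www.domain.com/", "0123456789"))
```

With the placeholder checksum from the post, this yields the same URL as above, joined onto one line.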

I was able to determine the underlying algorithm for calculating the checksum in Google's Toolbar 2.0.111:

Code:
GOOGLECHECK	proc near

var_8		= dword	ptr -8
var_4		= dword	ptr -4
url_offset	= dword	ptr  8
url_length	= dword	ptr  0Ch
magic_dword	= dword	ptr  10h

		push	ebp
		mov	ebp, esp
		push	ecx
		push	ecx
		mov	eax, [ebp+url_length]
		cmp	eax, 0Ch
		push	ebx
		push	esi
		mov	esi, [ebp+magic_dword] ; = 0xE6359A60
		push	edi
		mov	edi, 9E3779B9h	; derived from the golden number, hi TEA ;)
		mov	ebx, edi
		mov	[ebp+var_4], eax
		jb	jump_1
		push	0Ch
		pop	ecx
		xor	edx, edx
		div	ecx
		mov	ecx, [ebp+url_offset]
		mov	[ebp+var_8], eax

loop_1:
		movzx	eax, byte ptr [ecx+7]
		movzx	edx, byte ptr [ecx+6]
		shl	eax, 8
		add	eax, edx
		movzx	edx, byte ptr [ecx+5]
		shl	eax, 8
		add	eax, edx
		movzx	edx, byte ptr [ecx+4]
		add	edx, edi
		shl	eax, 8
		lea	edi, [edx+eax]
		movzx	eax, byte ptr [ecx+0Bh]
		movzx	edx, byte ptr [ecx+0Ah]
		shl	eax, 8
		add	eax, edx
		movzx	edx, byte ptr [ecx+9]
		shl	eax, 8
		add	eax, edx
		movzx	edx, byte ptr [ecx+8]
		add	edx, esi
		shl	eax, 8
		lea	esi, [edx+eax]
		movzx	edx, byte ptr [ecx+3]
		movzx	eax, byte ptr [ecx+2]
		shl	edx, 8
		add	edx, eax
		movzx	eax, byte ptr [ecx+1]
		shl	edx, 8
		add	edx, eax
		movzx	eax, byte ptr [ecx]
		shl	edx, 8
		add	edx, eax
		sub	edx, edi
		sub	edx, esi
		mov	eax, esi
		shr	eax, 0Dh
		add	edx, ebx
		xor	edx, eax
		sub	edi, edx
		sub	edi, esi
		mov	eax, edx
		shl	eax, 8
		xor	edi, eax
		sub	esi, edi
		sub	esi, edx
		mov	eax, edi
		shr	eax, 0Dh
		xor	esi, eax
		sub	edx, edi
		sub	edx, esi
		mov	eax, esi
		shr	eax, 0Ch
		xor	edx, eax
		sub	edi, edx
		sub	edi, esi
		mov	eax, edx
		shl	eax, 10h
		xor	edi, eax
		sub	esi, edi
		sub	[ebp+var_4], 0Ch
		sub	esi, edx
		mov	eax, edi
		shr	eax, 5
		xor	esi, eax
		sub	edx, edi
		mov	eax, esi
		shr	eax, 3
		sub	edx, esi
		xor	edx, eax
		mov	ebx, edx
		sub	edi, ebx
		sub	edi, esi
		mov	eax, ebx
		shl	eax, 0Ah
		xor	edi, eax
		sub	esi, edi
		mov	eax, edi
		sub	esi, ebx
		shr	eax, 0Fh
		xor	esi, eax
		add	ecx, 0Ch
		dec	[ebp+var_8]
		jnz	loop_1
		jmp	short jump_2

jump_1:
		mov	ecx, [ebp+url_offset]

jump_2:
		add	esi, [ebp+url_length]
		mov	eax, [ebp+var_4]
		dec	eax
		cmp	eax, 0Ah	; switch 11 cases
		ja	defaultswitch	; default
		jmp	ds:off_100307EA[eax*4] ; switch	jump

switch_10:
		movzx	eax, byte ptr [ecx+0Ah]	; case 0xA
		shl	eax, 18h
		add	esi, eax

switch_9:
		movzx	eax, byte ptr [ecx+9] ;	case 0x9
		shl	eax, 10h
		add	esi, eax

switch_8:
		movzx	eax, byte ptr [ecx+8] ;	case 0x8
		shl	eax, 8
		add	esi, eax

switch_7:
		movzx	eax, byte ptr [ecx+7] ;	case 0x7
		movzx	edx, byte ptr [ecx+6]
		shl	eax, 8
		add	eax, edx
		movzx	edx, byte ptr [ecx+5]
		shl	eax, 8
		add	eax, edx
		movzx	edx, byte ptr [ecx+4]
		shl	eax, 8
		add	edx, edi
		lea	edi, [edx+eax]
		jmp	short switch_3	; case 0x3

switch_6:
		movzx	eax, byte ptr [ecx+6] ;	case 0x6
		shl	eax, 10h
		add	edi, eax

switch_5:
		movzx	eax, byte ptr [ecx+5] ;	case 0x5
		shl	eax, 8
		add	edi, eax

switch_4:
		movzx	eax, byte ptr [ecx+4] ;	case 0x4
		add	edi, eax

switch_3:
		movzx	eax, byte ptr [ecx+3] ;	case 0x3
		movzx	edx, byte ptr [ecx+2]
		shl	eax, 8
		add	eax, edx
		movzx	edx, byte ptr [ecx+1]
		movzx	ecx, byte ptr [ecx]
		shl	eax, 8
		add	eax, edx
		shl	eax, 8
		add	ecx, ebx
		lea	ebx, [ecx+eax]
		jmp	short defaultswitch ; default

switch_2:
		movzx	eax, byte ptr [ecx+2] ;	case 0x2
		shl	eax, 10h
		add	ebx, eax

switch_1:
		movzx	eax, byte ptr [ecx+1] ;	case 0x1
		shl	eax, 8
		add	ebx, eax

switch_0:
		movzx	eax, byte ptr [ecx] ; case 0x0
		add	ebx, eax

defaultswitch:
		sub	ebx, edi	; default
		sub	ebx, esi
		mov	eax, esi
		shr	eax, 0Dh
		xor	ebx, eax
		sub	edi, ebx
		sub	edi, esi
		mov	eax, ebx
		shl	eax, 8
		xor	edi, eax
		sub	esi, edi
		sub	esi, ebx
		mov	eax, edi
		shr	eax, 0Dh
		xor	esi, eax
		sub	ebx, edi
		sub	ebx, esi
		mov	eax, esi
		shr	eax, 0Ch
		xor	ebx, eax
		sub	edi, ebx
		sub	edi, esi
		mov	eax, ebx
		shl	eax, 10h
		xor	edi, eax
		sub	esi, edi
		mov	eax, edi
		sub	esi, ebx
		shr	eax, 5
		xor	esi, eax
		sub	ebx, edi
		mov	eax, esi
		mov	ecx, eax
		sub	ebx, eax
		shr	ecx, 3
		xor	ebx, ecx
		sub	edi, ebx
		sub	edi, eax
		mov	ecx, ebx
		shl	ecx, 0Ah
		xor	edi, ecx
		sub	eax, edi
		sub	eax, ebx
		shr	edi, 0Fh
		xor	eax, edi
		pop	edi
		pop	esi
		pop	ebx
		leave
		retn
GOOGLECHECK	endp

; Switch table
off_100307EA	
		dd offset switch_0
		dd offset switch_1
		dd offset switch_2
		dd offset switch_3
		dd offset switch_4
		dd offset switch_5
		dd offset switch_6
		dd offset switch_7
		dd offset switch_8
		dd offset switch_9
		dd offset switch_10
When the function returns, eax holds the 32-bit checksum.
Now my question: can anyone port this code snippet to PHP?
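
As a reading aid for anyone attempting the port, here is a rough Python sketch of what the listing appears to do, register by register. It is a transcription under assumptions, not a verified implementation: the names `mix`, `le32` and `google_check` are invented here, and the thread does not say which exact string the Toolbar feeds into the function, so the sketch just takes raw bytes. The golden-ratio constant and the shift sequence look very much like Bob Jenkins' `lookup2` hash, but that is an observation, not something stated in the thread.

```python
MASK = 0xFFFFFFFF      # registers are 32 bits wide
GOLDEN = 0x9E3779B9    # initial value of edi and ebx in the listing
MAGIC = 0xE6359A60     # magic_dword as given in the listing


def mix(a, b, c):
    """The mixing sequence; a, b, c play the roles of ebx/edx, edi, esi."""
    for s1, s2, s3 in ((13, 8, 13), (12, 16, 5), (3, 10, 15)):
        a = ((a - b - c) & MASK) ^ (c >> s1)
        b = ((b - c - a) & MASK) ^ ((a << s2) & MASK)
        c = ((c - a - b) & MASK) ^ (b >> s3)
    return a, b, c


def le32(chunk):
    """Little-endian value of up to four bytes (0 for an empty chunk)."""
    return int.from_bytes(chunk, "little")


def google_check(data, magic=MAGIC):
    """Sketch of GOOGLECHECK(url_offset, url_length, magic_dword)."""
    a = b = GOLDEN
    c = magic
    pos = 0
    # loop_1: consume whole 12-byte blocks
    while len(data) - pos >= 12:
        a = (a + le32(data[pos:pos + 4])) & MASK
        b = (b + le32(data[pos + 4:pos + 8])) & MASK
        c = (c + le32(data[pos + 8:pos + 12])) & MASK
        a, b, c = mix(a, b, c)
        pos += 12
    # jump_2: add the total length, then fold in the 0..11 leftover bytes
    c = (c + len(data)) & MASK
    tail = data[pos:]
    a = (a + le32(tail[0:4])) & MASK
    b = (b + le32(tail[4:8])) & MASK
    # bytes 8..10 are shifted up one byte, as in switch_8..switch_10
    c = (c + (le32(tail[8:11]) << 8)) & MASK
    a, b, c = mix(a, b, c)   # the block at defaultswitch
    return c                 # the value left in eax


print(format(google_check(b"http://www.domain.com/"), "08x"))
```

The slices on `tail` stand in for the jump table: a short slice simply contributes fewer bytes, which has the same effect as entering the fall-through chain at a later case.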

Andy

Last edited by doctorow; 06-27-2004 at 09:57 AM.
Old 06-02-2004, 04:06 PM   #2
doctorow
Somebody asked me the following, and I thought it better to show the answer to everyone:
Quote:
Hi,
I was looking at the checksum algorithm for Pagerank and I am trying to figure it out myself.

My question is how do you get the value for these variables?

var_8
var_4
url_offset
url_length
magic_dword

var_8 and var_4 are local stack variables. They are uninitialized when the function is entered; the code then stores the URL length in var_4 and, for URLs of at least 12 bytes, the number of whole 12-byte blocks in var_8.
url_offset is a pointer to the URL of the web site whose rank we want to check.
url_length is the length of that URL in bytes.
magic_dword is a static 32-bit word passed to the function. As shown in the code, it is 0xE6359A60.
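
To illustrate how the two locals get their values, here is a small Python sketch that mirrors just the bookkeeping in the listing. The helper name `describe_locals` is made up, and `None` is used here to stand for "never written".

```python
def describe_locals(url: bytes):
    """Return (var_8, var_4) as they stand when the switch is reached."""
    url_length = len(url)
    var_4 = url_length              # mov [ebp+var_4], eax
    var_8 = None                    # never written when the URL is short
    if url_length >= 12:            # the jb jump_1 test
        var_8 = url_length // 12    # div ecx with ecx = 0Ch
        for _ in range(var_8):
            var_4 -= 12             # sub [ebp+var_4], 0Ch inside loop_1
    return var_8, var_4


# 22-byte URL: one full block, ten leftover bytes
print(describe_locals(b"http://www.domain.com/"))   # (1, 10)
```

The second value is what the switch looks at: var_4 - 1 indexes the jump table, so ten leftover bytes enter at switch_9.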
Old 06-08-2004, 12:07 PM   #3
Unregistered
Nameless Being
 
Would it be possible to use that assembly code to make a program in C and assembly to generate the checksums?
Old 06-08-2004, 02:22 PM   #4
doctorow
Sure it would be. You can compile the assembler stub above into an .obj and then easily link it into any C application (I've been doing this already, actually).

But it would be much more interesting to have the code above in PHP, which would allow for various web applications!
Old 06-10-2004, 07:59 AM   #5
palmar
Junior Member
Posts: 1
Karma: 10
Join Date: Jun 2004
Undefined symbol : off_100307EA

Hello

The assembler told me:
undefined symbol : off_100307EA

Can you tell me what that identifier means?
And how should it be declared, defined, and set?

Thanks.
Old 06-12-2004, 02:56 PM   #6
doctorow
off_100307EA is a jump table that I forgot to include. I will add the fixed code tomorrow.

Still, no one capable of converting ASM -> PHP?
Old 06-13-2004, 02:12 PM   #7
seo-junior
Junior Member
Posts: 1
Karma: 12
Join Date: Jun 2004
Quote:
Originally Posted by doctorow
Now my question: Can anyone rewrite this code snippet into PHP?

Still, no one capable of converting ASM -> PHP?

Somebody has released a PHP script that calculates the Google PageRank checksum needed for the "ch" parameter in PageRank queries, so it works without the Google Toolbar.

They say it has been tested with over 1.5 million different domains.

You can test this script at Google PageRank checksum calculation. There is some additional information available about this PHP script.

I've made several requests for my own domains, and the computed checksums all work properly.

Maybe this script is what you need.

Many regards!
Old 06-17-2004, 10:05 PM   #8
Unregistered
>off_100307ea is a jump table that I forgot to include. I will add the fixed code tomorrow.

Could you post the table? I'll try to -> PHP it. (=
Old 06-17-2004, 10:16 PM   #9
Unregistered
Also could you give me the memory address for that code you posted?
Old 06-22-2004, 06:51 AM   #10
doctorow
I added the missing jump switch table (off_100307EA).
Old 06-23-2004, 08:10 PM   #11
Unregistered
Is anyone seriously working on porting this to PHP? If so, how far have you got? I am still documenting the ASM code, but I'm almost finished, and I might have a working PHP implementation this weekend. If anyone would like to discuss working together on the port, please contact me at alex.stapleton@gmail.com so we can swap information.
Old 06-24-2004, 03:30 AM   #12
doctorow
Alex (I assume your name is Alex from your email),
please post and share your info here in this thread instead of e-mailing each other. Someone has already messaged me to say that he has working Perl code based on the code we posted here. I hope he is going to share it with us, too.
Old 06-24-2004, 06:37 AM   #13
Unregistered
OK, I will post what I know so far later today. I'm currently writing a PHP implementation of what appears to be a pretty simple hashing algorithm. I should have a fully functioning PHP implementation by Saturday or Sunday.
Old 06-24-2004, 07:53 AM   #14
Unregistered
My work is available here:
http://meese.ath.cx/google/
I am working on those files directly, so they will be updated as I make changes. The .phps might be a bit behind the .php, but I will try to update it frequently.

These are pretty rough working notes, so don't expect much for a while, but they may prove useful to someone else.
Old 06-24-2004, 10:26 AM   #15
doctorow
Thanks for the update! Cool idea to use $variables for the registers (eax, ...).