GUIDs

By peterbromberg August 31, 2010

NOTE: There is a more complete article on eggheadcafe.com here.

Peter has been active on the C# and other MS newsgroups, has had samples at gotdotnet.com (now codegallery.com) that have been downloaded over 56,000 times, and has written numerous articles for eggheadcafe.com, for which he is a co-founder, and MSDNAA sites. He has also been a technical reviewer on books from Prentice Hall, Manning and other publishers. Peter was a Microsoft MVP for 10 years. Peter was born and raised in New York City, and after a stint as a jazz musician playing contrabass with the Robert Hunt Trio, migrated to sunny Florida, where he was hired as a rookie Financial Consultant at Merrill Lynch in Orlando, specializing in Corporate Services. It was at Merrill that Peter realized computers and technology were his forte. He spent his last few years at Merrill developing real – time computerized trading and and stock charting program using advanced technical analysis indicators, some of which are his own inventions. In 1997, he moved on to architect and serve as lead developer on a number of innovative web-based and middleware solutions for the healthcare and financial services industry.

Comments

Anonymous1:49 PM
The 33 character "representation" of GUIDs is NOT what is compared by SQL Server any more than the 'Monday March 1st, 1995 12:00:00AM' is what is compared when doing a DATETIME comparison.

Rather, the value's 16-bytes are compared as raw binary. This is why NEWID() [aka randomized GUIDs] are so much worse than NEWSEQUENTIALID() [aka old-style MAC+TIME+TSEQ+SEQ]. The latter will be monotonically increasing within ranges.

So, GUIDs are not bad, randomized GUIDs are.

I will say that when you _CAN_ use an INT, BIGINT or segmented key [e.g. source-server TINYINT + IDENTITY INT], you should... this is when you can control the sources of the ids. But if you're dealing with clouds of servers, you'll need some globally unique value and that is what GUIDs are for.
ReplyDelete
Replies
peterbromberg2:05 PM
Marc,
I completely agree. The problem is, on my poll, only one respondent indicated the use of NEWSEQUENTIALID(). Most of us as developers are simply not aware of the raw size / fragmentation issues created by using UNIQUEIDENTIFIER as a primary key, because we are looking for convenience first and ignoring the details vis-a-vis performance.
ReplyDelete
Replies
Gaijin423:56 PM
As mentioned previously, the guid is not stored as a string,and is using numeric methods for comparison and indexing so that point is flat out wrong.

The paging issue is real, which is why COMB guids were instituted.

http://www.informit.com/articles/article.aspx?p=25862

Lack of education how to use a feature does not mean a feature is bad, or should be avoided, especially if YOU do know how to use the feature correctly.

that is why your "Im going to think twice..." comment makes no sense.
ReplyDelete
Replies
peterbromberg4:09 PM
@Gaijin42,
It really depends on how many rows you are going to be dealing with. Even the NEWSEQUENTIALID method has more fragmentation than an int identity PK, and fragmentation has performance implications.

So for me to say that I need to think about whether I am going to use GUIDs as primar keys is, I think, completely logical.
ReplyDelete
Replies
Anonymous10:20 AM
I've got some replication to consider.

One way or the other, perhaps it's laziness -- I find that with storage being cheap, it was the clustering/indexing that was my main concern with GUIDs as a PK...

...but you can get away with that if you read another article from Kimberly Tripp...
http://www.sqlskills.com/BLOGS/KIMBERLY/post/GUIDs-as-PRIMARY-KEYs-andor-the-clustering-key.aspx
ReplyDelete
Replies
peterbromberg10:28 AM
@Anonymous,
I read Tripp's article, I even quoted her in this post. But she does not provide a solution with NEWSEQUENTIALID() to get away from the space issue. You're still dealing with 16 byte wide ID's.
ReplyDelete
Replies
Anonymous3:50 PM
I only use GUIDs as PKs for UserIDs, for security reasons. I can also see using GUIDs for replication/merging databases, as you point out. But for pretty much everything else? Never. Nice to see some calculations and tests that verify my own reasons for eschewing GUIDs (I was always too lazy to test).
ReplyDelete
Replies
peterbromberg3:53 PM
@jonf,
One of the biggest problems is that if you've developed a database schema using GUIDs and you're in production with millions of rows of data, it's almost impossible to change at that point.
ReplyDelete
Replies
jeff5:07 PM
Only a developer would create a guid pk field in a database! DBA's nearly always always frown upon there use because of the fragmentation and size issues. Great post for those that do not know evils of guids in the db!
ReplyDelete
Replies
peterbromberg5:09 PM
@Jeff,
I think you just hit the nail right between the eyes on that score!
ReplyDelete
Replies

Search This Blog

Peter Bromberg's UnBlog

GUIDs

Comments

Post a Comment

Popular posts from this blog

IE7 - Vista: "Internet Explorer has stopped Working"

FIREFOX / IE Word-Wrap, Word-Break, TABLES FIX

KB929729 Windows Update Failure - An Easy FIX

FIX: Requested Registry Access is not allowed (Visual Studio 2008)

OMG, Silverlight! Asynchronous is Evil! (or, Call me back when you got it)