GUIDs
- Get link
- X
- Other Apps
NOTE: There is a more complete article on eggheadcafe.com here.
Peter has been active on the C# and other MS newsgroups, has had samples at gotdotnet.com (now codegallery.com) that have been downloaded over 56,000 times, and has written numerous articles for eggheadcafe.com, for which he is a co-founder, and MSDNAA sites. He has also been a technical reviewer on books from Prentice Hall, Manning and other publishers. Peter was a Microsoft MVP for 10 years.
Peter was born and raised in New York City, and after a stint as a jazz musician playing contrabass with the Robert Hunt Trio, migrated to sunny Florida, where he was hired as a rookie Financial Consultant at Merrill Lynch in Orlando, specializing in Corporate Services. It was at Merrill that Peter realized computers and technology were his forte. He spent his last few years at Merrill developing real – time computerized trading and and stock charting program using advanced technical analysis indicators, some of which are his own inventions.
In 1997, he moved on to architect and serve as lead developer on a number of innovative web-based and middleware solutions for the healthcare and financial services industry.
- Get link
- X
- Other Apps
The 33 character "representation" of GUIDs is NOT what is compared by SQL Server any more than the 'Monday March 1st, 1995 12:00:00AM' is what is compared when doing a DATETIME comparison.
ReplyDeleteRather, the value's 16-bytes are compared as raw binary. This is why NEWID() [aka randomized GUIDs] are so much worse than NEWSEQUENTIALID() [aka old-style MAC+TIME+TSEQ+SEQ]. The latter will be monotonically increasing within ranges.
So, GUIDs are not bad, randomized GUIDs are.
I will say that when you _CAN_ use an INT, BIGINT or segmented key [e.g. source-server TINYINT + IDENTITY INT], you should... this is when you can control the sources of the ids. But if you're dealing with clouds of servers, you'll need some globally unique value and that is what GUIDs are for.
Marc,
ReplyDeleteI completely agree. The problem is, on my poll, only one respondent indicated the use of NEWSEQUENTIALID(). Most of us as developers are simply not aware of the raw size / fragmentation issues created by using UNIQUEIDENTIFIER as a primary key, because we are looking for convenience first and ignoring the details vis-a-vis performance.
As mentioned previously, the guid is not stored as a string,and is using numeric methods for comparison and indexing so that point is flat out wrong.
ReplyDeleteThe paging issue is real, which is why COMB guids were instituted.
http://www.informit.com/articles/article.aspx?p=25862
Lack of education how to use a feature does not mean a feature is bad, or should be avoided, especially if YOU do know how to use the feature correctly.
that is why your "Im going to think twice..." comment makes no sense.
@Gaijin42,
ReplyDeleteIt really depends on how many rows you are going to be dealing with. Even the NEWSEQUENTIALID method has more fragmentation than an int identity PK, and fragmentation has performance implications.
So for me to say that I need to think about whether I am going to use GUIDs as primar keys is, I think, completely logical.
I've got some replication to consider.
ReplyDeleteOne way or the other, perhaps it's laziness -- I find that with storage being cheap, it was the clustering/indexing that was my main concern with GUIDs as a PK...
...but you can get away with that if you read another article from Kimberly Tripp...
http://www.sqlskills.com/BLOGS/KIMBERLY/post/GUIDs-as-PRIMARY-KEYs-andor-the-clustering-key.aspx
@Anonymous,
ReplyDeleteI read Tripp's article, I even quoted her in this post. But she does not provide a solution with NEWSEQUENTIALID() to get away from the space issue. You're still dealing with 16 byte wide ID's.
I only use GUIDs as PKs for UserIDs, for security reasons. I can also see using GUIDs for replication/merging databases, as you point out. But for pretty much everything else? Never. Nice to see some calculations and tests that verify my own reasons for eschewing GUIDs (I was always too lazy to test).
ReplyDelete@jonf,
ReplyDeleteOne of the biggest problems is that if you've developed a database schema using GUIDs and you're in production with millions of rows of data, it's almost impossible to change at that point.
Only a developer would create a guid pk field in a database! DBA's nearly always always frown upon there use because of the fragmentation and size issues. Great post for those that do not know evils of guids in the db!
ReplyDelete@Jeff,
ReplyDeleteI think you just hit the nail right between the eyes on that score!