GUIDs

NOTE: There is a more complete article on eggheadcafe.com here.

Comments

  1. Anonymous1:49 PM

    The 33 character "representation" of GUIDs is NOT what is compared by SQL Server any more than the 'Monday March 1st, 1995 12:00:00AM' is what is compared when doing a DATETIME comparison.

    Rather, the value's 16-bytes are compared as raw binary. This is why NEWID() [aka randomized GUIDs] are so much worse than NEWSEQUENTIALID() [aka old-style MAC+TIME+TSEQ+SEQ]. The latter will be monotonically increasing within ranges.

    So, GUIDs are not bad, randomized GUIDs are.

    I will say that when you _CAN_ use an INT, BIGINT or segmented key [e.g. source-server TINYINT + IDENTITY INT], you should... this is when you can control the sources of the ids. But if you're dealing with clouds of servers, you'll need some globally unique value and that is what GUIDs are for.

    ReplyDelete
  2. Marc,
    I completely agree. The problem is, on my poll, only one respondent indicated the use of NEWSEQUENTIALID(). Most of us as developers are simply not aware of the raw size / fragmentation issues created by using UNIQUEIDENTIFIER as a primary key, because we are looking for convenience first and ignoring the details vis-a-vis performance.

    ReplyDelete
  3. As mentioned previously, the guid is not stored as a string,and is using numeric methods for comparison and indexing so that point is flat out wrong.

    The paging issue is real, which is why COMB guids were instituted.

    http://www.informit.com/articles/article.aspx?p=25862

    Lack of education how to use a feature does not mean a feature is bad, or should be avoided, especially if YOU do know how to use the feature correctly.

    that is why your "Im going to think twice..." comment makes no sense.

    ReplyDelete
  4. @Gaijin42,
    It really depends on how many rows you are going to be dealing with. Even the NEWSEQUENTIALID method has more fragmentation than an int identity PK, and fragmentation has performance implications.

    So for me to say that I need to think about whether I am going to use GUIDs as primar keys is, I think, completely logical.

    ReplyDelete
  5. Anonymous10:20 AM

    I've got some replication to consider.

    One way or the other, perhaps it's laziness -- I find that with storage being cheap, it was the clustering/indexing that was my main concern with GUIDs as a PK...

    ...but you can get away with that if you read another article from Kimberly Tripp...
    http://www.sqlskills.com/BLOGS/KIMBERLY/post/GUIDs-as-PRIMARY-KEYs-andor-the-clustering-key.aspx

    ReplyDelete
  6. @Anonymous,
    I read Tripp's article, I even quoted her in this post. But she does not provide a solution with NEWSEQUENTIALID() to get away from the space issue. You're still dealing with 16 byte wide ID's.

    ReplyDelete
  7. Anonymous3:50 PM

    I only use GUIDs as PKs for UserIDs, for security reasons. I can also see using GUIDs for replication/merging databases, as you point out. But for pretty much everything else? Never. Nice to see some calculations and tests that verify my own reasons for eschewing GUIDs (I was always too lazy to test).

    ReplyDelete
  8. @jonf,
    One of the biggest problems is that if you've developed a database schema using GUIDs and you're in production with millions of rows of data, it's almost impossible to change at that point.

    ReplyDelete
  9. Only a developer would create a guid pk field in a database! DBA's nearly always always frown upon there use because of the fragmentation and size issues. Great post for those that do not know evils of guids in the db!

    ReplyDelete
  10. @Jeff,
    I think you just hit the nail right between the eyes on that score!

    ReplyDelete

Post a Comment

Popular posts from this blog

Some observations on Script Callbacks, "AJAX", "ATLAS" "AHAB" and where it's all going.

IE7 - Vista: "Internet Explorer has stopped Working"

FIREFOX / IE Word-Wrap, Word-Break, TABLES FIX