

#SQL CHECKSUM VERIFICATION#
They don’t have to, but may vary, based on the SQL Server configuration.Today, I specifically checked what are the verification functions under SQL Server. Don’t compare CHECKSUM and BINARY_CHECKSUM hashes.Else byte information will be lost and hash collisions are likely to occur Don’t store a HASHBYTES result in an integer or bigint.Use BINARY_CHECKSUM for varbinary columns.Use CHECKSUM for nvarchar and varchar columns.However, CHECKSUM detects the String difference and returns a different hash value. “2Volvo Director 20” and “3Volvo Director 30” yields, due to its hashing-algorithm implementation, the same hash value. However, the BINARY_CHECKSUM reads different bits and thus, returns a different result. “Hello World!” and “hello world!” is, for a case insensitive SQL Server configuration, the same for the CHECKSUM function. Well, lets have a look at following hash results. BINARY_CHECKSUM interprests the provided data as a binary sequence.CHECKSUM interprets the provided data as an case insensitive String, depending on the SQL Servers collation and case configuration.The one and only BIG difference is their way of interpreting information. Furthermore, HASHBYTES is system independent, comprehensible and suitable for cross database/platform software. E.g.: to determin whether a row has been changed. HASHBYTES focuses on a single value, while CHECKSUM and BINARY_CHECKSUM are capable of hashing an entire row which may be handy in some cases. The benchmark hashes a 500KB file.įrom my point of view, the three hashing algorithm do serve a different purpose. But at the end of the day the overall result should reflect a rough insight. The benchmark will surely vary among different system specifications. Furthermore, HASHBYTES offers more than one implementation: MD2, MD4, MD5, SHA, SHA1, SHA2_256, SHA2_512.

In other words if you plan to target not only TSQL, consider using HASHBYTES right away.

Sadly, it only works for Strings but not for binaries. There is an reverse engineered version for CHECKSUM. Hence, if you require to calculate the hashes within your application, you are out of luck. Because those two functions are only available on the SQL Server, one won’t be able to pre-calculate hash values outside of the domain.
#SQL CHECKSUM CODE#
CompatibilityĬHECKSUM and BINARY_CHECKSUM is a Transact-SQL proprietary implementation and its source code is not available for the public. (Best case) HASHBYTES does offer the best unique representation and narrows down the possibility of intersections. Even if we manage to hash 4.294.967.296 unique data entries, we will run consequently into hash collision upon the next insert. (We consider using SHA2_512 with HASHBYTES)Įven thought CHECKSUM and BINARY_CHECKSUM do have no input bit-length limitation (more or less, See Limitations) they only do return 4 bytes, in order to represent a hash value. In order to analyse the uniqueness, lets have a closer look on the function’s return values. With that said, based on the chosen hash function, an additional limitation may apply: The bit-length of an argument. On the other hand, HASHBYTES only accepts one input column. As the taxonomy already reflects, CHECKSUM and BINARY_CHECKSUM do accept more than one column as an input. We need to differ between the argument count limitation, and the bit length limitation. The Taxonomy lacks “Limitation”, simply because this topic derives a certain complexity.
