I'd hash strings using FNV hashing (the FNV-1a ordering), that is XORing the hash with each byte/char of the key, then multiplying by a carefully chosen prime.
None of my research explains in great detail why these particular primes are chosen, but for a 32-bit hash, 2^24 + 2^8 + 147 = 16,777,619 is IETF-recommended!
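A minimal sketch of that loop in C, assuming the XOR-then-multiply order (FNV-1a) and the standard 32-bit offset basis 2166136261 as the starting value. The starting value isn't mentioned above, and the name `fnv1a_32` is just for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: XOR in each byte, then multiply by the FNV prime.
 * The offset basis below is the standard one (an assumption here,
 * since only the prime is discussed above). */
uint32_t fnv1a_32(const void *key, size_t len) {
    const unsigned char *p = key;
    uint32_t hash = 2166136261u;      /* standard 32-bit offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= p[i];                 /* XOR the hash with the byte */
        hash *= 16777619u;            /* 2^24 + 2^8 + 147 */
    }
    return hash;                      /* unsigned overflow wraps mod 2^32 */
}
```

The multiply relies on unsigned 32-bit wraparound, which C defines, so no explicit modulo is needed.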
For small strings I'd use an internal table mapping hashes to deduplicated strings, so we can use the string's pointer as its hash. These strings would be length-prefixed.
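Roughly what I have in mind, as a sketch only: a fixed-size open-addressing table (the capacity, linear probing, and the names `IStr`, `InternTable` and `intern` are all assumptions for illustration) that hands back one canonical pointer per distinct string:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed string: the length comes before the bytes. */
typedef struct {
    uint32_t len;
    char bytes[];                 /* C99 flexible array member */
} IStr;

#define INTERN_SLOTS 1024u        /* assumed capacity; must be a power of two */

typedef struct {
    IStr *slots[INTERN_SLOTS];    /* must start zeroed (all NULL) */
    size_t used;
} InternTable;

static uint32_t fnv1a(const char *s, size_t n) {
    uint32_t h = 2166136261u;     /* standard 32-bit offset basis */
    for (size_t i = 0; i < n; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

/* Returns the canonical copy of s[0..n), or NULL if the table is full
 * or allocation fails. Equal contents always yield the same pointer. */
static const IStr *intern(InternTable *t, const char *s, uint32_t n) {
    uint32_t i = fnv1a(s, n) & (INTERN_SLOTS - 1);
    for (uint32_t probes = 0; probes < INTERN_SLOTS; probes++) {
        IStr *e = t->slots[i];
        if (e == NULL) {          /* empty slot: string not seen before */
            if (t->used + 1 >= INTERN_SLOTS) return NULL;
            e = malloc(sizeof *e + n);
            if (e == NULL) return NULL;
            e->len = n;
            memcpy(e->bytes, s, n);
            t->slots[i] = e;
            t->used++;
            return e;
        }
        if (e->len == n && memcmp(e->bytes, s, n) == 0)
            return e;             /* already interned: reuse the pointer */
        i = (i + 1) & (INTERN_SLOTS - 1);   /* linear probing */
    }
    return NULL;
}
```

Once two strings are interned, comparing them is a pointer comparison, and the pointer value itself can stand in as the hash in any later table.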
As for longer strings...
5/6!!