Transcribe Web3 lingo using Custom Vocabularies within AWS

Improve your AWS transcriptions using Custom Vocabularies.

I was transcribing a few audio recordings that contained a lot of Web3 jargon, and I needed a better way to capture terms like ERC20, ERC1155, and ETH. What worked for me was using Amazon Transcribe with Custom Vocabularies.

Visit AWS and Create a Custom Vocabulary. Amazon has a nice video tutorial that I suggest watching. The video alongside my examples below should be enough to get you started.
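If you would rather script this step than click through the console, here is a minimal sketch using boto3. The `create_vocabulary` and `start_transcription_job` calls are part of the boto3 Transcribe client. Everything else is an assumption for illustration: the helper names, the bucket `my-bucket`, the vocabulary name `web3-vocab`, the job name, and the audio file. It also assumes the vocabulary file has already been uploaded to S3.

```python
def vocabulary_request(name, s3_uri, language_code="en-US"):
    """Build keyword arguments for the Transcribe create_vocabulary call."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("the vocabulary file must be referenced by an s3:// URI")
    return {
        "VocabularyName": name,
        "LanguageCode": language_code,
        "VocabularyFileUri": s3_uri,
    }


def job_request(job_name, media_uri, vocabulary_name, language_code="en-US"):
    """Build keyword arguments for start_transcription_job with one vocabulary."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
        # A job references a single custom vocabulary by name.
        "Settings": {"VocabularyName": vocabulary_name},
    }


def submit():
    """Send both requests to AWS (needs boto3 and valid credentials)."""
    import boto3  # third-party; imported here so the helpers work without it

    client = boto3.client("transcribe")
    # Hypothetical bucket and names; replace with your own.
    client.create_vocabulary(
        **vocabulary_request(
            "web3-vocab", "s3://my-bucket/CustomTableWeb3Vocabulary.txt"
        )
    )
    # The vocabulary has to finish processing before a job can use it,
    # so in practice you would wait for it to be ready before this call.
    client.start_transcription_job(
        **job_request("web3-call-001", "s3://my-bucket/call.mp3", "web3-vocab")
    )
```

Keeping the request building separate from the AWS calls makes it easy to check the arguments before anything is sent.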

Web3 Custom Vocabulary Examples

The two examples below each define a Custom Vocabulary as a table. It's elegant that you can describe pronunciations either with the International Phonetic Alphabet (IPA) or with plain phrases, but it's a bummer that you have to pick one standard. You cannot have a single document with both IPA and phrase entries, nor can you use two custom vocabulary files in one transcription job. In other words, pick one approach per job.
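A small check can catch table mistakes before you upload a file. The sketch below is not an official AWS validator. It assumes the four-column, tab-separated layout used in this post, and it reads the "pick one approach" rule as "a file may fill the IPA column or the SoundsLike column, but not both". The function name `check_vocabulary` is made up for this example.

```python
HEADER = ["Phrase", "IPA", "SoundsLike", "DisplayAs"]


def check_vocabulary(text):
    """Return a list of problems found in a tab-separated vocabulary table."""
    rows = text.splitlines()
    if not rows or rows[0].split("\t") != HEADER:
        return ["line 1: expected header " + " ".join(HEADER)]

    problems = []
    ipa_lines, sounds_like_lines = [], []
    for number, row in enumerate(rows[1:], start=2):
        if not row.strip():
            continue  # ignore blank lines
        cells = row.split("\t")
        if len(cells) != len(HEADER):
            problems.append(
                f"line {number}: expected 4 tab-separated columns, found {len(cells)}"
            )
            continue
        phrase, ipa, sounds_like, _display_as = cells
        if not phrase:
            problems.append(f"line {number}: Phrase is empty")
        if ipa:
            ipa_lines.append(number)
        if sounds_like:
            sounds_like_lines.append(number)

    # Assumed reading of the rule above: one pronunciation style per file.
    if ipa_lines and sounds_like_lines:
        problems.append("file mixes IPA and SoundsLike entries; pick one approach")
    return problems
```

An empty list means the file passed these checks. A row that is missing a tab shows up as a column-count problem with its line number.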

Create a file named CustomTableWeb3Vocabulary.txt

Phrase	IPA	SoundsLike	DisplayAs
cee-fi			CeFi
de-fi			DeFi
d-i-y			DIY
e-r-c			ERC
e-r-c-twenty			ERC20
e-r-c-seven-twenty-one			ERC721
e-r-c-eleven-fifty-five			ERC1155
e-r-c-one-one-five-five			ERC1155
i-c-o		eye-see-o	ICO
t-l-d-r			TLDR
t-p-s			TPS
b-t-c			BTC
solana		so-la-na	Solana
e-v-m			EVM
k-y-c			KYC
t-v-l			TVL
e-n-s			ENS
off-chain			off-chain
on-chain			on-chain
on-ramp			onramp
a-w-s			AWS
on-board			onboard
web-three			Web3
layer-one			L-1
layer-two			L-2
layer-three			L-3
drop			drop
meta-mask			metamask
dao			DAO
n-f-t			NFT
token			token
web-two			Web2
Alts			alts
back-end			backend
front-end			frontend
contract			contract
ETH		ee-th	ETH
fifty-one-percent			51 percent
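Hand-typed tabs are easy to get wrong, so one option is to generate the table from a list of entries. This is a minimal sketch, assuming the same four-column layout as the table above. The function name `build_table` and the dictionary-based entry format are my own choices for illustration.

```python
HEADER = ["Phrase", "IPA", "SoundsLike", "DisplayAs"]


def build_table(entries):
    """Render entries (dicts keyed by column name) as a tab-separated table."""
    lines = ["\t".join(HEADER)]
    for entry in entries:
        unknown = set(entry) - set(HEADER)
        if unknown:
            raise ValueError(f"unknown columns: {sorted(unknown)}")
        # Missing columns become empty cells, so every row has four fields.
        row = [entry.get(column, "") for column in HEADER]
        if any("\t" in cell or "\n" in cell for cell in row):
            raise ValueError(f"cell contains a tab or newline: {row}")
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


web3_entries = [
    {"Phrase": "e-r-c-twenty", "DisplayAs": "ERC20"},
    {"Phrase": "solana", "SoundsLike": "so-la-na", "DisplayAs": "Solana"},
    {"Phrase": "web-three", "DisplayAs": "Web3"},
]

if __name__ == "__main__":
    # Write the generated table to the file name used in this post.
    with open("CustomTableWeb3Vocabulary.txt", "w", encoding="utf-8") as handle:
        handle.write(build_table(web3_entries))
```

Because empty columns are filled in automatically, each row always has exactly three tabs.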

Create a file named CustomTableWeb3Vocabulary-IPA.txt

Phrase	IPA	SoundsLike	DisplayAs
git	ɡ ɪ t		GIT
gas	ɡ æ s		gas
hodl	h oʊ d l		HODL
mainnet	m eɪ n n ɛ t		mainnet
testnet	t ɛ s t n ɛ t		testnet
fomo	f oʊ m oʊ		FOMO
github	ɡ ɪ t h ə b		GitHub
node	n oʊ d		node
tailwind	t eɪ l w ɪ n d		TailwindCSS
gwei	ɡ w eɪ		GWEI