Engineering A Personal Link Archive

The internet is big. Once in a while during our daily browsing, we come across a link that resonates with us in some way. So what do we do with that link?

Email the URL to ourselves? Text ourselves? Bookmarks? Or maybe use a third-party service like Pocket?

Yuck. For me, those solutions are adequate at best, terrible at worst. After a while, you have links saved everywhere and no idea where to find the one you need right now.

To solve this, I built my own archive. Now my links are searchable and organized, where I can easily find what I’m looking for. You can view my saved links at carbonemike.com/links. If you want to know how I did it, keep reading!

Building The Basics

The motivation to build my own tool comes from wanting to own this information. I don’t want to trust a third-party company, and I want to structure it exactly how I want. I’ve built a solution for this before as an old project called CarbonCollective, and that was great, but as a way of modernizing and consolidating, I’m starting fresh.

What better use for our Content API? (For those just tuning in, I built my own Content API for this website to store everything of mine: images, videos, posts, and more.)

First step is adding the table to our database:

CREATE TABLE links (
  id UUID PRIMARY KEY,
  title VARCHAR(255) NOT NULL,
  url TEXT NOT NULL,
  image TEXT,
  description TEXT,
  note VARCHAR(1000),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
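
With the data in one table, searching it later is a single query. Here is a sketch of the kind of search this enables (not necessarily the site’s actual query):

```sql
-- Find saved links mentioning "postgres", newest first
SELECT title, url
FROM links
WHERE title ILIKE '%postgres%'
   OR description ILIKE '%postgres%'
ORDER BY created_at DESC;
```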

With our table created, we can now add our endpoints that’ll allow us to communicate with the Content API. These endpoints enable us to create, read, update, and delete any link we want. Here is what an API endpoint looks like:

app.delete('/links/:linkId', protection, async (req, res) => {
  try {
    const { linkId } = req.params;
    const text = `DELETE FROM links WHERE id = $1;`;
    const values = [linkId];
    await client.query(text, values);
    return res.status(200).send({ success: true });
  } catch (err) {
    console.error(err);
    return res.status(500).send(err);
  }
});

That endpoint allows us to delete a link that we’ve saved.

Next, we need a frontend user interface to communicate with the API. To generate a fully functional interface immediately, I can use a tool I built called Interweave. After writing a small configuration file that describes the shape of my data, I generate an interface that looks like this:

Interweave allows me to quickly create interfaces for all of my API endpoints.

Speeding Up The User Experience

Excellent. This all works perfectly, but we can make the UX even better. Manually entering a title, description, and image for each link gets annoying. Sometimes I want to save a link quickly without spending too much time thinking about it.

To do this, we can build a more customized frontend that’ll auto-fill that information for us. How? Almost every website includes invisible meta information in its HTML, stored in the <head> element. Usually, this information communicates details about the web page to search engines and other services. Well, we can consider this tool “another service” and use that information to auto-fill our form.
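
For example, a typical page’s <head> might contain tags like these (values invented for illustration). The property and name attributes follow the Open Graph and standard meta conventions:

```html
<head>
  <title>Example Article</title>
  <meta name="description" content="A short summary of the page." />
  <!-- Open Graph tags, used for link previews -->
  <meta property="og:title" content="Example Article" />
  <meta property="og:image" content="https://example.com/cover.png" />
</head>
```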

So let’s write a script that’ll fetch the website in code:

const response = await fetch(url);
if (!response.ok) {
  return res.status(500).send({ title: '', description: '', image: '' });
}
const html = await response.text();

You can think of fetch as entering the website’s address in a web browser’s URL bar: it makes a request and retrieves whatever lives at that URL. If we run into an error, we send back an empty object and indicate that we failed. Failing isn’t a huge deal, since the empty response just means we fill out the rest by hand. If we succeed, we can get the HTML via response.text().

When we try this, it works perfectly! However, the code won’t be able to run on the frontend because of Cross-Origin Resource Sharing (CORS) restrictions; in short, code running on one web page can’t freely request pages from other websites. So let’s instead run this logic on the server and add a new API endpoint for our website to talk to. We can send the URL we want to save, and our server will do the parsing.

That endpoint looks like this:

app.post('/meta', async (req, res) => {
  try {
    const { url } = req.body;
    const response = await fetch(url);
    …
  } catch (err) {
    console.error(err);
    // Don't really care if this fails
    return res.status(200).send({ title: '', description: '', image: '' });
  }
});

Now that we have our API, we can make our request and use the received HTML to create a virtual representation of the Document Object Model (DOM), which will allow us to easily search the HTML for the information we need.

Here we use the jsdom library to create our DOM on the server, then do some JavaScript parsing to get the information we need from the webpage:

const { window } = new JSDOM(html);
const doc = window.document;
const head = doc.getElementsByTagName('head')[0];
const title = head.querySelector(`meta[property='og:title']`)?.content || '';
const description = head.querySelector(`meta[name='description']`)?.content || '';
const image = head.querySelector(`meta[property='og:image']`)?.content || '';

Then if this succeeds, we can send back the information:

return res.status(200).send({ title, description, image });
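
On the frontend, that response can feed straight into the form. Here is a hedged sketch of the wiring (mergeMeta, the element ids, and the field names are assumptions for illustration, not the site’s real code):

```javascript
// Keep anything the user already typed; fill the gaps from fetched metadata.
function mergeMeta(current, meta) {
  return {
    title: current.title || meta.title || '',
    description: current.description || meta.description || '',
    image: current.image || meta.image || '',
  };
}

// Hypothetical form handler: ask the server for metadata, then fill
// any empty inputs. Assumes inputs with ids "title", "description", "image".
async function autofillFromUrl(url) {
  const res = await fetch('/meta', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url }),
  });
  const meta = await res.json();
  for (const field of ['title', 'description', 'image']) {
    const input = document.getElementById(field);
    if (input) input.value = mergeMeta({ [field]: input.value }, meta)[field];
  }
}
```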

Using this information, we can auto-fill our form. The UX is really nice: saving a link is fast and easy. Here is a video of the experience:

Saving links is a breeze with the auto-fill feature.

And instantly, we have our link saved for future reference!

Link saved in my Content API and displayed on the frontend.

You can check out all of my saved links right here. Enjoy!