David Beroff (d4b) wrote,

Spider bait

Two quick bits before my main post:

My daughter posted to her Facebook Monday evening: "Update: there has been an explosion. Cause thus far unknown. But I'm having trouble studying when MY SCHOOL IS BLOWING UP." (We were at this very point on campus a year and a half ago.) There's strong speculation that the power outage and the subsequent ammonia leak and generator explosion were a chain reaction set off by an extensive copper theft. Yes, Sara and friends are OK.

I've been posting about our earlier QR code T-shirt campaign (ironically, from that very same trip) on Warrior Forum, along with the two videos.

I woke up around 2am with an idea:

One potential downside of running a primarily-audio service is that it could be mostly invisible to web-crawling spiders. This matters for search engine optimization (SEO) and especially for advertising relevance: AboutTh.is will place ads on the Basic (free) opinions, and those ads will of course be worth more if they are relevant to the opinion and the underlying article. The problem is that the planned pages will actually have very little unique text.

I was already planning on fetching the first few sentences of the underlying article for the pages' meta tags and possibly invisible CSS spans, but I need to be careful to limit the amount, because of concerns about the owner's copyright, duplicate content, and possibly being perceived as trying to game the system. There's also the matter of training the scraper software to fetch the correct sentences, i.e., what is true content versus what is navigation/advertising/whatever.
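A minimal sketch of the "limit the amount" part, assuming the scraper has already isolated the article's body text. The function names, the two-sentence limit, and the 160-character cap are illustrative assumptions, not anything AboutTh.is has settled on, and the sentence splitter is deliberately naive.

```python
import html
import re


def first_sentences(text, max_sentences=2, max_chars=160):
    """Return the leading sentences of text, never longer than max_chars."""
    # Collapse whitespace so line breaks in the scraped text don't matter.
    text = " ".join(text.split())
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    excerpt = " ".join(sentences[:max_sentences])
    if len(excerpt) > max_chars:
        # Leave room for the ellipsis, then back up to the last whole word.
        excerpt = excerpt[:max_chars - 3].rsplit(" ", 1)[0] + "..."
    return excerpt


def meta_description_tag(article_text, **limits):
    """Build a description meta tag from a short, escaped excerpt."""
    excerpt = first_sentences(article_text, **limits)
    return '<meta name="description" content="%s">' % html.escape(excerpt, quote=True)
```

Keeping both limits as parameters makes it easy to tighten the excerpt later if copyright or duplicate-content concerns turn out to be more pressing than expected.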

This morning, I thought: Wait... I have legit claim to the audio content itself. Why not auto-transcribe it and silently post that invisibly? Sure, mechanical transcription is far from perfect, but who cares: it won't show (unless you view source), it's much better than nothing, and it is statistically likely to capture at least some of the relevant keywords. Plus, we could always supplement this with outsourced human transcription for the opinions that prove to be more popular. So, this morning, I've been sniffing around for various solutions, from the free DictaNote to the top-of-the-line Dragon (of Dragon NaturallySpeaking fame). Clearly, the better stuff will be pricey, but the nice thing here is that, since I'm generating all pages on the fly, nothing says I can't start ultra-cheap and then upgrade if/when there's financial merit in doing so.

Oh, and I'm also allowing optional tagging for each Opinion, at least by the original Member who recorded it (for now), so that should yield at least a few keywords, too.
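As a rough illustration of how a flawed machine transcript and the Member's own tags could both feed a page's keywords, here is a minimal sketch. It assumes the transcript arrives as plain text from whichever engine is chosen. The stopword list, the function names, and the limit of ten terms are all hypothetical, and plain frequency counting stands in for any smarter relevance scoring.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list; a real one would be much longer.
STOPWORDS = frozenset(
    "a an and are as at be but by for from has have i in is it of on or "
    "so that the this to was we with you".split()
)


def transcript_keywords(transcript, limit=10, min_length=3):
    """Return the most frequent non-stopword terms in a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(limit)]


def page_keywords(member_tags, transcript, limit=10):
    """Merge Member-supplied tags (listed first) with transcript terms."""
    merged = []
    for term in list(member_tags) + transcript_keywords(transcript, limit):
        term = term.strip().lower()
        if term and term not in merged:  # skip blanks and duplicates
            merged.append(term)
    return merged[:limit]
```

Member tags go first because a human chose them; the transcript terms only fill whatever room is left, which limits the damage from misrecognized words.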

I also recently noticed that YouTube has been (quietly?) auto-generating video transcriptions, most likely for very similar reasons; maybe that's where I got the idea in the first place. :-)
Tags: advertising, california, fec book, marketing, mtat, qr code, sara