Solo Web Startup

A solo entrepreneur's 1st startup

  • @Refynr
  • My startup: Refynr.com
  • Me on LinkedIn
    • 22
      22 Jun 2011

      MySQL database design question: Trigger or not?

      • Edit
      • Delete
      • Tags
      • Autopost

      [UPDATE 3, on July 15, 2011]: I solved the mystery of mySQL MAX(ID) queries timing out, though I don't know the root cause. Ideas?

      I wrapped the query in question with a try/catch and sent the details to myself in an email with the UserID in the subject. I quickly saw a pattern that about 10 UserIDs kept showing up. I saw in the database that those accounts were for people that hadn't used Refynr for months, so I disabled their accounts and deleted their StreamItem data. Voila! The timeout errors are gone!

      But I have no idea why. Some corrupt data? Some huge IDs that the system couldn't handle? Ideas?

      [/UPDATE 3]

      [UPDATE 4]

      NOPE. That did NOT solve the problem. A couple of days later, presumably as the data grew again, the errors came back. I tried the trick from update 3 above, but new timeout errors kept coming up.

      Next, I upgraded from MySQL 5.1 to 5.5. It didn't help.

      Then, I even upgraded the server from 4G to 8G of RAM. That didn't help either.

      I know I have poor database design, but I think in the long run I'm going to need something more scalable anyways, so I'm working on switching this table to MongoDB. So far, so good. It hasn't been too painful so far.

      [/UPDATE 4]

      ---------------------------------------------------------------------------------------------------------------------------------

       

       

      Okay, here's the situation:

      1. I have a table called StreamItems that collects tweets/posts from Twitter and Facebook
      2. The IDs from Twitter and FB are large numbers like 73054697145118720 and 100000057279452_212728265405748
      3. The database server is dedicated for MySQL 5.1 Community, has 4G RAM, and is Quad Core
      4. I've tuned the MySQL memory settings to match (I'm pretty sure I've done a decent job here, but could be wrong)
      5. I get timeout errors at times when trying to get a MAX() or doing an ORDER BY on these IDs
      6. Now that there's over 5 million Items saved, it can't handle the sorting necessary, so we need a better solution

      My thought is to:

      1. Create a table called StreamItemMaxIDs that stores the max IDs for each Twitter/FB account
      2. Instead of scanning the whole StreamItems table for MAX or sorted IDs each time I need it, simply store the MAX ID in this separate table so that the query is fast every time

      [I know that a NoSQL solution *might* be better (especially long-term), but am looking for a mySQL solution at the moment.]

      So, should I keep the logic of updating the MAX IDs in the application-level code, or write a mySQL trigger? (Keep in mind I'm not a DBA)

      Let me know if you need more info.

      [UPDATE: here's the table schema]

      [UPDATE 2: here's the dynamic SQL that times out under load]

       

      • views
      • Tweet
    • 0
      29 Dec 2010

      Regex to find cfquery missing cfqueryparam

      • Edit
      • Delete
      • Tags
      • Autopost
      This is a regex within CFEclipse of ColdFusion Builder I've used in the past to clean up legacy code that's missing CFQueryParam's:

      \s+(where|set)\s+[a-z_\.]+\s+(like|\=)\s+'?%?#[a-z"_'\.\(\)\+\/]+#%?'?

      A CF-based solution is also http://qpscanner.riaforge.org/

      Aaron Longnion
      http://refynr.com

      • views
      • Tweet
    • 1
      4 Aug 2010

      Re-runnable MySQL release scripts

      • Edit
      • Delete
      • Tags
      • Autopost

      In a dev team, often times you manage database release scripts via "release***.SQL" files that are built up by various team members and saved in version control systems such as SVN or GIT. One problem is that when developer A makes some data or schema/design changes to the database, developer B & C don't necessarily know if it will cause duplicate records, or with schema changes it can cause errors if the scripts are run more than once. Here's a few concrete examples:

      1. For a Users table, developer A writes a sql script to insert some new records. Developer C updates to HEAD in SVN and runs the scripts from developer A. A few days later, Developer B writes some additional scripts to insert other new records. Developer C updates SVN again and wants to make sure his database is up-to-date, but he's smart and doesn't want to mess up anything with duplicate records, so he asks Developer A and B which scripts have already been run, etc. Developer A and B can't remember exactly, so Developer C does some SVN diffs and confirms which records have already been written to his local copy of the database. Now he knows he only has to run the new scripts from Developer B, and can skip the ones from Developer A.
      2. Developer A creates some new tables. Developer C runs the scripts on his local database. Developer B creates more new tables. When Developer C runs the scripts again, he gets errors about a database table already existing. He has to investigate a little to see what the error was all about, and if it's really a problem, or just an expected side-effect.

      Yuck! That's too much of a headache for team members to have to worry about. So, let's make some rules about release scripts for the database:

      1. There should never be errors. If there's an error, then there was a problem in the script and someone should fix that error. If not, then we should be able to safely assume the database is properly updated.
      2. You should be able to always run all the scripts in a release file without causing errors or duplicate rows of data. Write your scripts so that every time the scripts are updated, the next developer can run the entire file safely. This takes extra work up-front, but saves a lot of confusion and headaches over the long term.

      For creating new tables, MySQL provides native syntax: CREATE TABLE IF NOT EXISTS `Employees` (...

      For making sure you only add a column once, or only change a column name once, the code below includes stored procedures to handle it: AddColumnUnlessExists () and ChangeColumnUnlessExists().

      For making sure you only add a row of data once, MySQL also provides some native, though verbose, syntax. Here's the SQL to make it happen.

      Note that there's also version control tools for databases, but that may or may not work for your team and I am not familiar with those tools.... yet.

       

      • views
      • Tweet
    • Search

    • Sites I Like

      • Sean Corfield's blog
      • Filter Twitter & Facebook w/ Refynr
      • Raymond Camden's ColdFusion Blog
      • Aaron West's Blog
      • Ben Nadel's blog
      • A Smart Bear blog
      • IttyBiz blog
      • Charlie Arehart ColdFusion Consulting
    • Tags

      • refynr
      • ColdFusion
      • startup
      • Twitter
      • entrepreneur
      • social media
      • facebook
      • railo
      • Jobs
      • cloud computing
      • mysql
      • programming
      • social networks
      • legacy code
      • cfml
      • jquery
      • marketing
      • video
      • Developer
      • Holland
      • design
      • logo
      • rackspace
      • LinkedIn
      • Subversion
      • crowdsource
      • googleplus
      • linux
      • security
      • Peoplexs
      • SEO
      • Verity
      • adobe
      • conferences
      • openbd
      • radio
      • redhat
      • regex
      • Balsamiq
      • Ben Forta
      • CFBuilder
      • CFUG
      • Napkee
      • SOLR
      • SOTR
      • air
      • android
      • austin
      • beta
      • blog
      • bufferapp
      • centos
      • cfeclipse
      • chrome
      • clustering
      • css
      • curation
      • customer service
      • database
      • dutch
      • feedback
      • filter
      • filtering
      • funding
      • git
      • gmail
      • google
      • humor
      • i18N
      • iphone
      • jQM
      • java
      • jrun
      • json
      • microsoft
      • mobile
      • movies
      • mxunit
      • nosql
      • ocfs
      • open office
      • pitch
      • public speaking
      • rhel
      • scalability
      • sql server
      • svn
      • ui
      • unit testing
      • ux
      • web
      • webapp
      • website
      • websockets
      • windows
      • work
      • youtube
    • Archive

      • 2012 (6)
        • February (4)
        • January (2)
      • 2011 (57)
        • September (2)
        • August (1)
        • July (4)
        • June (3)
        • May (9)
        • April (5)
        • March (14)
        • February (8)
        • January (11)
      • 2010 (66)
        • December (13)
        • November (11)
        • October (7)
        • September (8)
        • August (5)
        • July (2)
        • June (2)
        • May (2)
        • April (7)
        • March (3)
        • February (1)
        • January (5)
      • 2009 (21)
        • December (7)
        • November (5)
        • October (3)
        • September (5)
        • August (1)
    • Obox Design
  • Solo Web Startup

    Web developer/architect turned solo web entrepreneur

    324925 Views
  • Get Updates

    Subscribe via RSS
    TwitterTwitterFacebookPageFacebookLinkedInFriendfeedDelicious