Archive for category Databases

MySQL – limiting the result returned by Show Processlist

Use /P (pager) to limit the result you wish to see.

For example, if you wish to see all processes except the ones sleeping then do the following:


mysql> \P grep -v Sleep
mysql> show processlist;

No Comments

MySQL innodb_flush_log_at_trx_commit parameter

Setting innodb_flush_log_at_trx_commit in my.cnf file can have huge impact to the performance of your MySQL server. The impact of setting this parameter is stated below:

innodb_flush_log_at_trx_commit=X

  • Setting X to 2: causes the log buffer to be written to the log file after each commit, but file writes are flushed to disk approximately once a second. This is somewhat slower than a setting of 0. However the committed changes will not be lost if it is only the MySQL server that crashes and not the operating system or server host. In this scenario the machine continues to run, so the changes written to the log file are in the filesystem cache and eventually will be flushed normally.

  • Setting X to 1: changes are written from log buffer to the log file and the log file is flushed to disk for each commit. This guarantees that the changes will not be lost even in the event of a crash. This is the safest setting, and is also the required setting if you need ACID durability. It is also the slowest setting.

  • Setting X to 0: causes the log file to be written and flushed to disk approximately once a second, but not after each commit. on a busy system, this can reduce log-related disk activity significantly, but in the event of a crash can result in loss of about a second’s worth of commited changes.

No Comments

MySQL – Return table creation time along with other information

Quick query to return the time of creation along with some other info for a table:

select TABLE_NAME, TABLE_TYPE,  ENGINE, TABLE_ROWS, CREATE_TIME from TABLES where TABLE_NAME = 'my_table';

No Comments

What’s New and Upcoming in HDFS

Another great presentation on Hadoop!

http://www.slideshare.net/cloudera/hdfs-update-lipcon-federal-big-data-apache-hadoop-forum

No Comments

Big Data Security with Hadoop

Great presentation!

http://www.slideshare.net/cloudera/big-data-security-apache-accumulo-echeveriafederal-big-data-apache-hadoop-forum

1 Comment

How to check on HBase Master Status?

There is no direct way to find the status. There are two ways I found that would clearly indicate the status:

  1. Web UI - Comes with the installation.

  2. Zookeeper cli

As mentioned in the previous post, “The HBase master publishes its location to clients via Zookeeper. This is done to support multimaster operation (failover).”

Its important to note that any change sto HBase Master will take place within the timeout mentioned in zookeeper.session.timeout.

If you’re like us and would like to create scripts to check on the status of the HBase Master then Zookeeper commands is the best way to find the HBase Master and the status. Here are some useful commands:

Connect to Zookeeper through HBase:

  • hbase zkcli -server ${ZKHOST}:${ZKPORT}
    ZKHost: Zookeeper Host
    ZKPORT: Zookeeper Port. Default is 2181.

Find Master Node controlling zookeeper:

  • Once connected to Zookeeper using above command then you can run the following to get the Master Node which is controlling Zookeeper at this time:

    get /hbase/master

    Sample output:

    [zk: localhost:2181(CONNECTED) 0] get /hbase/master
    �18182@nodedb2.hbdomainnodedb2.hbdomain,60000,1361996502780
    cZxid = 0xa0000000a
    ctime = Wed Feb 27 12:21:22 PST 2013
    mZxid = 0xa0000000a<br />
    mtime = Wed Feb 27 12:21:22 PST 2013
    pZxid = 0xa0000000a
    cversion = 0
    dataVersion = 0
    aclVersion = 0
    ephemeralOwner = 0x23d1d50b3980001
    dataLength = 65
    numChildren = 0


    If none of the master nodes are available then you get the following error:

    [zk: localhost:2181(CONNECTED) 1] get /hbase/master
    Node does not exist: /hbase/master


Find backup Master Nodes waiting to take control:

  • Once connected to Zookeeper through HBase then you can run the following command to return the list of Backup Master Nodes:

    ls /hbase/backup-masters

    Sample Output:

    [zk: localhost:2181(CONNECTED) 7] ls /hbase/backup-masters
    [db-hb1.hbdomain,60000,1362010238030]

    If backup Master Nodes are not available or down:

    [zk: localhost:2181(CONNECTED) 8] ls /hbase/backup-masters
    []

No Comments

HBase Master

HBase Architecture

  • For more information on HBase Architecture, refer to this link

HBase Master

  • HMaster is the implementation of the Master Server. The Master server is responsible for monitoring all RegionServer instances in the cluster, and is the interface for all metadata changes.

  • The multi-master feature introduced in 0.20.0 does not add cooperating Masters; there is still just one working Master while the other backups wait. For example, if you start 200 Masters only 1 will be active while the others wait for it to die. The switch usually takes zookeeper.session.timeout plus a couple of seconds to occur.
    http://wiki.apache.org/hadoop/Hbase/MultipleMasters

At Start Up:

  • If run in a multi-Master environment, all Masters compete to run the cluster. If the active Master loses its lease in ZooKeeper (or the Master shuts down), then then the remaining Masters jostle to take over the Master role.

  • The HBase master publishes its location to clients via Zookeeper. This is done to support multimaster operation (failover). So if the HBase master self-discovers its location as a localhost address, then it will publish that. Region servers or clients which go to Zookeeper for the master location will get back an address in that case only useful if they happen to be co-located with the master.
    http://www.mail-archive.com/hbase-user@hadoop.apache.org/msg07296.html

  • What happens if the Hmaster goes down?
  • A common dist-list question is what happens to an HBase cluster when the Master goes down. Because the HBase client talks directly to the RegionServers, the cluster can still function in a “steady state.” Additionally, per Section 9.2, “Catalog Tables” ROOT and META exist as HBase tables (i.e., are not resident in the Master). However, the Master controls critical functions such as RegionServer failover and completing region splits. So while the cluster can still run for a time without the Master, the Master should be restarted as soon as possible.

No Comments

Big Data – Zookeeper, HDFS, HBase…

It seems BigData is all everyone is talking about these days. So I’m going to start my posts on the subject by mentioning some basics about the platform:

No Comments

How to create a new SQL Database Broker?

To Enable SQL Database broker simply run the following TSQL command:

ALTER DATABASE db_name SET NEW_BROKER

No Comments

The Service Broker in database “db_name” cannot be enabled because there is already an enabled Service Broker with the same ID.

Getting the following error when enabling the broker.

The Service Broker in database “db_name” cannot be enabled because there is already an enabled Service Broker with the same ID.

The above error happens when the database was restored from a database that already has the broker enabled. To fix this you need to create a new broker using:

ALTER DATABASE db_name SET NEW_BROKER

No Comments