Pencarian

Rss Posts

 

 

 

Berita pada kategori ‘Basisdata’

MySQL Cluster: 5 Steps to Getting Started, then 5 More to Scale for the Web

Sep 03, 2010

Join us for a live and interactive webinar session where we will demonstrate how to start an evaluation of the
MySQL Cluster database in 5 easy steps, and then how to expand your
deployment for web & telecoms-scale services.Just register here: http://www.mysql.com/news-and-events/web-seminars/display-566.htmlGetting Started will describe how to:

Get the softwareInstall itConfigure itRun itTest it

Scaling for HA and the web will describe how to:

Review the requirements for a HA configurationInstall the software on more serversUpdate & extend the configuration from a single host to 4Roll out the changesOn-line scaling to add further nodesWhen: Wednesday, September 08, 2010: 09:00 Pacific time (America)

Wed, Sep 08: 11:00 Central time (America)

Wed, Sep 08: 12:00 Eastern time (America)

Wed, Sep 08: 16:00 UTC

Wed, Sep 08: 17:00 Western European time

The presentation will be approximately 45 minutes long followed by Q&A.

dbbenchmark.com ? configuring OpenBSD for MySQL benchmarking

Sep 03, 2010

Here are some quick commands for installing the proper packages and requirements for the MySQL dbbenchmark program.

export PKG_PATH=”ftp://openbsd.mirrors.tds.net/pub/OpenBSD/4.7/packages/amd64/”
pkg_add -i -v wget
wget http://dbbenchmark.googlecode.com/files/dbbenchmark-version-0.1.beta_rev26.tar.gz
pkg_add -i -v python
Ambiguous: choose package for python
a 0:
1: python-2.4.6p2
2: python-2.5.4p3
3: python-2.6.3p1
Your choice: 2

pkg_add -i -v py-mysql
pkg_add -i -v mysql
pkg_add -i -v mysql-server
ln -s /usr/local/bin/python2.5 /usr/bin/python
gzip -d dbbenchmark-version-0.1.beta_rev26.tar.gz
tar -xvf dbbenchmark-version-0.1.beta_rev26.tar
cd dbbenchmark-version-0.1.beta_rev26
./dbbenchmark.py –print-sql
– login to mysql and execute sql commands
./dbbenchmark.py

MySQL 5.1 Plugins Development Published

Aug 28, 2010

MySQL 5.1 has a great feature which not many people know about, that is the fact it can be extended via. the use of plugins. ?Unfortunately how you go about this is not incredibly well documented. ?You can search for examples on the internet, dig through the MySQL source code and ask on the forums and you may figure it all out. ?But doing all this is time consuming and could easily put someone off. So Sergei Golubchik and I have got together to bring you this book which will show you, using examples, how to write your own plugins.
We start by explaining the UDF API which has been around for a long time, and then move on to Daemon Plugins, Information Schema Plugins, Full-text Search Plugins and Storage Engine Plugins. ?Each with?usable?examples.
MySQL 5.1 Plugins Development has just been published by Packt Publishing and I believe it is well worth a look if you are thinking of extending MySQL.

MariaDB 5.1.49 for Mac OS X

Aug 27, 2010


Stuttgart: a rainy day, waiting for Iftar. Good time for good news!
During my vacation I read about a request for a MariaDB package for Mac OS X? and did some research. Back from vacation I have an alpha version of MariaDB package for Mac OS X for? our community to test.
Caution: this is the first installer I ever wrote on a Mac, so use it on a test system only!
I would like you to test the installer and provide us with feedback.
Known issues in the MariaDB installer:

The Preferences Pane app for starting/stopping the server instance is missing
The installer for setting up MariaDB as a Startup Item is missing.

Side node: while digging into the Mac installer I found two
bugs in the MySQL Mac OS X installer.

http://bugs.mysql.com/bug.php?id=56279

Mac installer does not work as documented

http://bugs.mysql.com/bug.php?id=56280

Mac installer’s postflight script does not work all the time

You can grab the package from here:

http://lisas.de/~hakan/file/mariadb-5.1.49-osx10.6-x86.pkg

MySQL GIS ? Part 1

Aug 23, 2010

TweetIn my business (weather) we use lots map based (Geo) information.? Almost every table has latitude and longitude. Working with this kind of data can be exciting and frustrating.? This should give you a quick start into GIS with MySQL.
“A geographic information system (GIS), or geographical information system, is any system that captures, stores, analyzes, manages, and presents data that are linked to location. In the simplest terms, GIS is the merging of cartography, statistical analysis, and database technology. GIS systems are used in cartography, remote sensing, land surveying, utility management, natural resource management, photogrammetry, geography, urban planning, emergency management, navigation, and localized search engines.” – Wikipedia
GIS / Mapping Systems work with both text data and graphical data.? Applications and utilities often blur the lines between the two types and make understanding difficult.? Map servers blend raster images, with point or polygon data, and bitmap images to make complete images to display in the user’s client application.? For this post I will concentrate on the text type “data”.? The type we can index in a MySQL database.
THE SEARCH
After months of reading, [1]? I’m writing this post to describes what I have learned about how to get started using GEO coding data as quickly as possible.? I found very little piratical information on GIS and MySQL.? The MySQL manual covers the functions but doesn’t supply much practical information on GEO.? Anders Karlsson wrote a nice and short story about GIS that give me a good start.
The best information has be written by Florin Duroiu in his post titled “Political boundaries overlay in Google maps”. A good part of my post is based on his work.
STEP BY STEP
Below are the detailed needed to to produce a MySQL database with the Points of Interest (POI).? This is based on CentOS 5.5 with MySQL 5.1.
yum install gdal
mkdir geo
mkdir data
mkdir data/Oklahoma
cd geo/data/Oklahoma
wget http://downloads.cloudmade.com/north_america/united_states/oklahoma/oklahoma.shapefiles.zip
unzip oklahoma.shapefiles.zip
mysql -e ‘create database geo’
ogr2ogr -f “MySQL” MySQL:”geo,user=root,host=localhost,password=” -nln oklahoma_poi -lco engine=MYISAM oklahoma_poi.shp
mysql geo -e ‘desc oklahoma_poi’

+———-+————–+——+—–+———+—————-+
| Field | Type | Null | Key | Default | Extra |
+———-+————–+——+—–+———+—————-+
| OGR_FID | int(11) | NO | PRI | NULL | auto_increment |
| SHAPE | geometry | NO | MUL | NULL | |
| category | varchar(30) | YES | | NULL | |
| name | varchar(113) | YES | | NULL | |
+———-+————–+——+—–+———+—————-+

mysql geo -e “select name, category, Y(SHAPE) as lat, X(SHAPE) as lng from oklahoma_poi where NAME like ‘School:Putnam%’”

+————————————-+——————————–+————+————-+
| name | category | lat | lng |
+————————————-+——————————–+————+————-+
| School:Putnam City West High School | Government and Public Services | 35.492557 | -97.6605975 |
| School:Putnam City North School | Government and Public Services | 35.5892209 | -97.6372648 |
| School:Putnam City School | Government and Public Services | 35.5122794 | -97.6142079 |
| School:Putnam High School | Government and Public Services | 35.5214459 | -97.6086523 |
| School:Putnam Heights Academy | Government and Public Services | 35.5081143 | -97.5397619 |
+————————————-+——————————–+————+————-+

In a coming set of post I’ll go over:

The “Data” types your will find and how to convert between them.
What data is available and where can you find it?
More examples on what you can do with GIS data.
Viewing our GIS data.
How to collect your own GIS data.
Good and bad examples of searching GIS data.
Optimizing MySQL GIS.? Is it really worth using?

[1] Books: GIS for Dummies – Author: Michael N. DeMers – John Wiley & Sons?(2009) – ISBN: 0470236825
Open Source GIS: A GRASS GIS Approach. Third Edition.-? Author: Markus Neteler and Helena Mitasova – ISBN: 978-0-38735767-6
Web Mapping Illustrated: Using Open Source GIS Toolkits – Author: Tyler Mitchell – ISBN: 9780596008659
?

InnoDB memory allocation, ulimit, and OpenSUSE

Aug 23, 2010

I recently encountered an interesting case. A customer reported that mysqld crashed on start on OpenSUSE 11.2 kernel 2.6.31.12-0.2-desktop x86_64 ? with 96 GB RAM when the innodb_buffer_pool_size was set to anything more than 62 GB. I decided to try it with 76 GB. The error message was an assert due to a failed malloc() in ut_malloc_low() in ut/ut0mem.c inside InnoDB source code. InnoDB wraps the majority of its memory allocations in ut_malloc_low(), so to get an idea of the pattern of requested allocations I added a debugging fprintf() to tell me how much was being allocated and whether it was successful.
I discovered something interesting. I expected the allocation to fail on the 76 GB of the buffer pool, due to some weird memory mapping issue and a continuous block of 76 GB not being available. However, that is not what happened. 76 GB buffer was allocated successfully. What was failing is the allocation of 3.37GB after that. What in the world could InnoDB need that was 3.37 GB? There was nothing in the settings that asked for anything close to 3 GB explicitly.
Source code is the ultimate documentation, and I took advantage of that. My good friend GDB guided me to buf_pool_init() in buf/buf0buf.c. There I found the following:
buf_pool->frame_mem = os_mem_alloc_large(
UNIV_PAGE_SIZE * (n_frames + 1),
TRUE, FALSE);
That was the buffer pool itself, the 76 GB of it. And now the buffer pool’s friend:
buf_pool->blocks = ut_malloc(sizeof(buf_block_t) * max_size);
3.6 GB of it!
From the comments in the code (InnoDB code actually has very good comments), max_size is the maximum number of buffer pool pages (16K each), n_frames which is the same thing unless AWE is used, but it was not used, so I did not worry about it.
What shall we call that friend? It is used for storing some meta information about buffer pool pages. The most natural name I could come up with from reading the source code is the blocks array.
Thus we can see that we are allocating another chunk that is in proportion to the setting of innodb_buffer_pool_size for the blocks array. The exact proportions will probably vary from version to version, but roughly about 1 G for every 25 G of the buffer pool. This can become significant in the proper innodb_buffer_pool_size estimations when the system has a lot of RAM and you want to have the largest possible innodb_buffer_pool_size. Do not forget to give the blocks array some room!
While this was an interesting investigation, it nevertheless did not explain why there was not enough room for a 76 GB buffer pool. Even with the extra 3.37 GB allocation, there was still some free memory. Or was there? Maybe some hidden monster was eating it up? I quickly wrote this hack to prove or disprove the monster’s presence.
I verified that I could allocate and initialize two chunks of 40 GB from two separate processes, but not 80 GB from one. In fact, 80GB allocation failed right in malloc(), did not even get to initialization. I tested it with allocating 70 GB concurrently in each process so as to overrun physical memory + swap. Both allocations were successful, one initialized successfully, the other was killed by the OOM kill during initialization.
This smelled like a low ulimit, and sure enough it was. ulimit -m ulimited; ulimit -v unlimited did the magic, and mysqld successfully started with an 80 GB buffer pool. Apparenly OpenSUSE defaults are set in proportion to physical memory to keep the memory-hungry applications from taking the system down. On this particular system (96 GB physical memory, 2 GB swap it decided to set the virtual memory ulimit (-v) to 77.27 GB, and the physical memory (-m) to 80.40 GB).

Entry posted by sasha |
No comment
Add to: | | | |

Why GRANT ALL is bad

Aug 06, 2010

A common observation for many LAMP stack products is the use of poor MySQL security practices. Even for more established products such as Wordpress don’t always assume that the provided documentation does what it best for you. As per my earlier posts where I detailed installation instructions and optimal permissions for both WordPress and Mediawiki, and not just directed readers to online documentation.
In this post I will detail why GRANT ALL is bad.
Let’s start with what GRANT ALL [PRIVILEGES] provides. As per the MySQL 5.1 Reference Manual you get the following:
ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TEMPORARY TABLES, CREATE USER, CREATE VIEW, DELETE, DROP, EVENT, EXECUTE, FILE, INDEX, INSERT, LOCK TABLES, PROCESS, REFERENCES, RELOAD, REPLICATION CLIENT, REPLICATION SLAVE, SELECT, SHOW DATABASES, SHOW VIEW, SHUTDOWN, SUPER, TRIGGER, UPDATE, USAGE
I am going to focus on just one privilege that is included with ALL, and that is SUPER. This privilege can do the following which can be destructive for an application level user:

Bypasses read_only
Bypasses init_connect
Can Disable binary logging
Change configuration dynamically
No reserved connection

User Permissions
This is how a user should be created, granting only the required permissions to a given schema.

CREATE USER goodguy@localhost IDENTIFIED BY ’sakila’;
GRANT CREATE,SELECT,INSERT,UPDATE,DELETE ON odtug.* TO goodguy@localhost;

This is what is commonly seen.

CREATE USER superman@’%';
GRANT ALL ON *.* TO superman@’%';

Bypasses read_only
Many MySQL replication environments rely on ensuring the MySQL slave is consistent with the master. Did you know that an application can bypass this security when read_only=true is used?

$ mysql -ugoodguy -psakila odtug
mysql> insert into test1(id) values(1);
ERROR 1290 (HY000): The MySQL server is running with the –read-only option so it cannot execute this statement

$ mysql -usuperman odtug
mysql> insert into test1(id) values(1);
Query OK, 1 row affected (0.01 sec)

GRANT ALL is bad for data consistency.
Bybasses init_connect
A common practices used for UTF8 communications is to use the init_connect configuration variable.

#my.cnf
[client]
init_connect=SET NAMES utf8

$ mysql -ugoodguy -psakila odtug

mysql> SHOW SESSION VARIABLES LIKE ‘ch%’;
+————————–+———-+
| Variable_name | Value |
+————————–+———-+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
+————————–+———-+

$ mysql -usuperman odtug

mysql> SHOW SESSION VARIABLES LIKE ‘character%’;
+————————–+———-+
| Variable_name | Value |
+————————–+———-+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
+————————–+———-+

GRANT ALL is bad for data integrity.
Disables Binary Logging.

$ mysql -usuperman odtug

mysql> SHOW MASTER STATUS;
+——————-+———-+————–+——————+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+——————-+———-+————–+——————+
| binary-log.000001 | 354 | | |
+——————-+———-+————–+——————+

mysql> DROP TABLE time_zone_leap_second;
mysql> SET SQL_LOG_BIN=0;
mysql> DROP TABLE time_zone_name;
mysql> SET SQL_LOG_BIN=1;
mysql> DROP TABLE time_zone_transition;
mysql> SHOW MASTER STATUS;
+——————-+———-+————–+——————+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+——————-+———-+————–+——————+
| binary-log.000001 | 674 | | |
+——————-+———-+————–+——————+

$ mysqlbinlog binary-log.000001 –start-position=354 –stop-position=674

# at 354
#100604 18:00:08 server id 1 end_log_pos 450 Query thread_id=1 exec_time=0 error_code=0
use mysql/*!*/;
SET TIMESTAMP=1275688808/*!*/;
DROP TABLE time_zone_leap_second
/*!*/;
# at 579
#100604 18:04:31 server id 1 end_log_pos 674 Query thread_id=2 exec_time=0 error_code=0
use mysql/*!*/;
SET TIMESTAMP=1275689071/*!*/;
DROP TABLE time_zone_transition
/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;

Should that statement be run on MySQL Slaves?
Is the binary log used for any level of auditing?
GRANT ALL is bad for slave consistency.
The reserved connection
MySQL reserved one connection for an administrator to be able to login to a server. For example.

$ mysql -uroot

mysql> show global variables like ‘max_connections’;
+—————–+——-+
| Variable_name | Value |
+—————–+——-+
| max_connections | 3 |
+—————–+——-+
1 row in set (0.07 sec)

mysql> show global status like ‘threads_connected’;
+——————-+——-+
| Variable_name | Value |
+——————-+——-+
| Threads_connected | 4 |
+——————-+——-+

mysql> SHOW PROCESSLIST;
+—-+——+———–+——-+———+——+————+—————
| Id | User | Host | db | Command | Time | State | Info
+—-+——+———–+——-+———+——+————+—————
| 13 | good | localhost | odtug | Query | 144 | User sleep | UPDATE test1 …
| 14 | good | localhost | odtug | Query | 116 | Locked | select * from test1
| 15 | good | localhost | odtug | Query | 89 | Locked | select * from test1
| 15 | root | localhost | odtug | Query | 89 | Locked | SHOW PROCESSLIST

However if all application users are already using the SUPER privilege, the administrator will get.

$ mysql -uroot
ERROR 1040 (HY000): Too many connections

There is no way to be able to login and see what’s happening, or kill threads for example. In this case you either wait, or you are required to kill the mysqld process, or fine the client threads to kill. The result of the former may lead to a corrupt database requiring additional recovery.
GRANT ALL is bad for system administration and monitoring.
Conclusion
Don’t use GRANT ALL for application users. For more information, including why I only listed just 5 issues, check out my MySQL Idiosyncrasies that BITE presentation. I will also be presenting this talk at MySQL Sunday at Oracle Open World in September.

Rails on PostgreSQL: PGCon 2010 talk on Rails and PostgreSQL

Aug 03, 2010

A while back I posted a link to a talk by Gleb Arshinov that he gave at the SF PUG. This talk was on “PostgreSQL for high performance Rails apps”, and was full of fine suggestions from their experiences with their Rails apps.

Gleb is back again, this time on May 21 2010 at PGCon where he and Alexander Dymo talked about PostgreSQL as a secret weapon for Rails apps. Some of the same ground is covered (use SQL DDL vs ActiveRecord create_table, etc), but there’s lots of new information too. Here are some notes:

  • 1:10 They’re using PostgreSQL 8.4, nginx, and mongrel
  • 4:00-6:00 Talks about dropping down into SQL via ActiveRecord
  • 6:30 Use include to eliminate N+1 queries.
  • 7:30 Watch for things like acts_as_tree that reintroduce lots of queries in exchange for the improvement in abstraction.
  • 9:00 One query, 12 joins – complicated, but query time goes from 8 seconds to 60 ms.
  • 14:00-17:00 A technique for recording SQL queries; this helps ensure you’re not running unexpected queries
  • 19:00 Suggests use straight SQL for DDL rather than the ActiveRecord DSL
  • 20:00 Use constraints, FKs, etc to preserve data integrity – “anything you don’t have a constraint on will get corrupted”
  • 23:00 Don’t use CASCADE since app won’t know about the deletions
  • 28:00 Keep a log of times for the most frequent user requests. Alex suggests using integration tests for this; code is at 29:10 and 29:30.
  • 32:30 A technique for loading data with ActiveRecord’s select option with PostgreSQL arrays to save on object creation. Questions from the audience about normalization vs efficiency.
  • 38:50 Role/user/privilege checking can be slow; shows a technique for using PostgreSQL’s bool_or and GROUP BY to get the data in one fell swoop. Query time went from 2+ seconds to 64 ms.
  • 42:00 Do analytics in the database. Saw speed improve from 90s to 5s and saved tons of RAM.
  • 44:40 Some excellent new PostgreSQL features that are either here now or are on the way (replication, windowing functions)
  • 46:30 Demonstrates a problem with PostgreSQL’s LIMIT and OFFSET when used with subselects. Some discussion of pagination with the audience. Here’s an excellent discussion of pagination alternatives written by Justin French.
  • 50:30 How to force PostgreSQL to use a subselect vs a join; the example goes from 605ms to 325 ms.
  • 52:20 Be careful with generate_series. Apparently these functions cannot generate hints for the planner.
  • 55:30 General props to PostgreSQL community.
  • 59:40 Need to test queries both in cold state and hot state; they saw 14x speed difference.
  • 1:01:40 Tune PostgreSQL – shared_buffers, work_mem, autovacuum, etc. Rely on community knowledge for initial configuration.

Lots of good stuff there, enjoy!

MySQL Master HA at Yahoo

Aug 03, 2010

I was asked to write a blog post about MySQL High Availability at Yahoo, particularly for writes. Our standard practice is not particularly high-tech, but we’ve been using it for over 4 years now and it has become a company-wide standard with a few exceptions.
 
Let me start by saying that at Yahoo! we consider a datacenter as a Single Point of Failure (SPoF). We build and manage many of our own datacenters, and we still don’t assume they are invulnerable. How many people can attest to the fact that however to configure your racks, how many redundant switches, power supplies, drives, etc. you buy, if your leased datacenter has power or network issues, you are at their mercy.
 

read more

Tips for taking MySQL backups using LVM

Aug 03, 2010

LVM uses copy-on-write to implement snapshots. Whenever you’re writing data to some page, LVM copies the original page (the way it looked like when the snapshot was taken) to the snapshot volume. The snapshot volume must be large enough to accommodate all pages written to for the duration of the snapshot’s lifetime. In other words, you must be able to copy the data somewhere outside (tape, NFS, rsync, etc.) in less time than it would take for the snapshot to fill up.
While LVM allows for hot backups of MySQL, it still poses an impact on the disks. An LVM snapshot backup may not go unnoticed by the MySQL users.
Some general guidelines for making life easier with LVM backups follow.
Lighter, longer snapshots
If you’re confident that you have enough space on your snapshot volume, you may take the opportunity to make for a longer backup time. Why? Because you would then be able to reduce the stress from the file system. Use ionice when copying your data from the snapshot volume:

ionice -c 2 cp -R /mnt/mysql_snapshot /mnt/backup/daily/20100719/

Are you running out of space?
Monitor snapshot’s allocated size: if there’s just one snapshot, do it like this:

Every 10.0s: lvdisplay | grep Allocated????????????????????????????????????????????????????????????????????????????????????????????????????????????????? Mon Jul 19 09:51:29 2010

Allocated to snapshot? 3.63%

Don’t let it reach 100%.
Avoid running out of space
To make sure you don’t run out of snapshot allocated size, stop all administrative scripts.

Are you running your weekly purging of old data? You will be writing a lot of pages, and all will have to fit in the snapshot.
Building your reports? You may be creating large temporary tables; make sure these are not on the snapshot volume.
Rebuilding your Sphinx fulltext index? Make sure it is not on the snapshot volume, or postpone till after backup.

You will gain not only snapshot space, but also faster backups.
Someone did the job before you
Use mylvmbackup: the MySQL LVM backup script by Lenz Grimmer. Or do it manually: follow this old-yet-relevant post by Peter Zaitsev.