Installing Lighttpd With PHP5 And MySQL Support On Ubuntu 10.10
Nov 30, 2010
Guyub adalah perusahaan TI berpusat di Palembang dengan fokus pada F/OSS Produk-produk >> Layanan-layanan >>
Nov 05, 2010
INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?);
INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?);
INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?);
While this works, there are two scalability limitations. First is the network overhead of the back and forth of each SQL statement, the second is the synchronous nature, that is your code can not continue until your INSERT is successfully completed.
The first improvement is to use MySQL’s multi values capability with INSERT. That is you can insert multiple rows with a single INSERT statement. For example:
INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?), VALUES (?, ?, ?), (?, ?, ?);
Depending on how you collect the information to be inserted, you may be able to easily implement this. The benefit, as previously mentioned in The value of multi insert values shows an 88% improvement in performance.
One system variable to be aware of is max_allowed_packet. You may have to adjust this for larger INSERT statements.
Next is the ability to INSERT data based on information already in another table. You can also leverage for example another storage engine like MEMORY to batch up data to be inserted via this approach.
INSERT INTO table (col1, col2, col3) SELECT col1, col2, col3 FROM anothertable
The third option is to batch load your data from a flat file. Depending on how you source the information you are inserting, that may also be a significant improvement in throughput in bulk inserting data.
LOAD DATA [CONCURRENT] INFILE ‘file’
INTO TABLE (col1, col2, col3)
On a closing note, the choice of storage engine can also have a significant effect on INSERT throughput. MySQL also offers other non ANSI options including DELAYED, IGNORE and LOW_PRIORITY. These should definitely be avoided.
Nov 01, 2010
Open source has gone a long way toward putting power back in the hands of developers, who can download, install and deploy software without having to go through any kind of?convoluted sales or budget approval process. ?You want?MySQL? ?You can download and install?in?15 minutes, and you don?t have to talk to anyone to do it.
Software as a service (SaaS) takes this to an even broader audience, enabling employees to get the kind of lightweight, consumer, self-serve capabilities in their job without even having to run their own servers. ?Platforms like?Amazon AWS, Heroku, Makara, RightScale?and others put this same kind of SaaS power in the hands of developers…
My view: ease of use trumps a long feature list any day of the week. There are both techological reasons as well as sociological and economic reasons for why organizations are seeking greater simplicity. ?Part of this stems from the fact that complex enterprise applications grew beyond the ability of most organizations to successfully adopt. ?
Head over to GigaOm for the full post.
Sep 03, 2010
Get the softwareInstall itConfigure itRun itTest it
Scaling for HA and the web will describe how to:
Review the requirements for a HA configurationInstall the software on more serversUpdate & extend the configuration from a single host to 4Roll out the changesOn-line scaling to add further nodesWhen: Wednesday, September 08, 2010: 09:00 Pacific time (America)
Wed, Sep 08: 11:00 Central time (America)
Wed, Sep 08: 12:00 Eastern time (America)
Wed, Sep 08: 16:00 UTC
Wed, Sep 08: 17:00 Western European time
The presentation will be approximately 45 minutes long followed by Q&A.
Sep 03, 2010
export PKG_PATH=”ftp://openbsd.mirrors.tds.net/pub/OpenBSD/4.7/packages/amd64/”
pkg_add -i -v wget
wget http://dbbenchmark.googlecode.com/files/dbbenchmark-version-0.1.beta_rev26.tar.gz
pkg_add -i -v python
Ambiguous: choose package for python
a 0:
1: python-2.4.6p2
2: python-2.5.4p3
3: python-2.6.3p1
Your choice: 2
pkg_add -i -v py-mysql
pkg_add -i -v mysql
pkg_add -i -v mysql-server
ln -s /usr/local/bin/python2.5 /usr/bin/python
gzip -d dbbenchmark-version-0.1.beta_rev26.tar.gz
tar -xvf dbbenchmark-version-0.1.beta_rev26.tar
cd dbbenchmark-version-0.1.beta_rev26
./dbbenchmark.py –print-sql
– login to mysql and execute sql commands
./dbbenchmark.py
Aug 24, 2010
Max Lifshin, an Android developer living in Russia, says his Tap Snake program is not a Trojan or virus, despite a warning from security software maker Symantec last week. Lifshin has been vilified in the press for releasing the program, which was intended for parents to track their children?s whereabouts. Reached by ZDNet for comment, Lifshin insisted that his motivations were innocent: The app is no more malicious than a motion detection camera – everything depends on the user?s intentions. It gives all the proper warnings and requires a set up, a conscious action, to report location. It can be easily used by mothers worrying about their kids? whereabouts. In fact, I suspect the majority of users were indeed the mothers. For the program to work, the parent or guardian downloads and installs the innocuous looking game on their kid?s phone. During the installation process, Android asks for permission to access location information and to send and receive information to the Internet. After accepting these terms, the parent must open up a menu option and activate the tracking service with a unique key. Then they give the phone back to their child. From that point on, the game will occasionally upload its location to a server, where only somebody with the key can view it. Lifshin says: The app is not really very different from Google?s Latitude. As any technology product, it can be put to either noble or malicious ends. The game can be uninstalled at any time. The program run by the parent to view location information is called GPS Spy. The Market description for GPS Spy openly explained how all this works, saying: Download and install the free Tap Snake game from the Market to the phone you want to spy on. Press MENU and register the Snake with the service. Use the GPS Spy app on your phone with the same email/code to track the location of the other phone. However, the description of the Tap Snake game did not say anything about tracking, presumably so your child could look up the game for updates or reviews and be none the wiser. Until recently, Tap Snake was a free download and GPS Spy was $4.99. After the news came out, Google removed both apps from the Market. According to Lifshin,
What?s sad is that these ?whistle blowers? have prompted Google to suspend the app and thus deprived me of income. They unfairly classified this app as a Trojan and portrayed me as a villain, a malicious Russian developer working in the shadows. What do you think: is this a dangerous Trojan or a useful safety device for parents? Was Google right to ban it? Speak up in the Talkback section below.
Aug 06, 2010
Bypasses read_only
Bypasses init_connect
Can Disable binary logging
Change configuration dynamically
No reserved connection
User Permissions
This is how a user should be created, granting only the required permissions to a given schema.
CREATE USER goodguy@localhost IDENTIFIED BY ’sakila’;
GRANT CREATE,SELECT,INSERT,UPDATE,DELETE ON odtug.* TO goodguy@localhost;
This is what is commonly seen.
CREATE USER superman@’%';
GRANT ALL ON *.* TO superman@’%';
Bypasses read_only
Many MySQL replication environments rely on ensuring the MySQL slave is consistent with the master. Did you know that an application can bypass this security when read_only=true is used?
$ mysql -ugoodguy -psakila odtug
mysql> insert into test1(id) values(1);
ERROR 1290 (HY000): The MySQL server is running with the –read-only option so it cannot execute this statement
$ mysql -usuperman odtug
mysql> insert into test1(id) values(1);
Query OK, 1 row affected (0.01 sec)
GRANT ALL is bad for data consistency.
Bybasses init_connect
A common practices used for UTF8 communications is to use the init_connect configuration variable.
#my.cnf
[client]
init_connect=SET NAMES utf8
$ mysql -ugoodguy -psakila odtug
mysql> SHOW SESSION VARIABLES LIKE ‘ch%’;
+————————–+———-+
| Variable_name | Value |
+————————–+———-+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
+————————–+———-+
$ mysql -usuperman odtug
mysql> SHOW SESSION VARIABLES LIKE ‘character%’;
+————————–+———-+
| Variable_name | Value |
+————————–+———-+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
+————————–+———-+
GRANT ALL is bad for data integrity.
Disables Binary Logging.
$ mysql -usuperman odtug
mysql> SHOW MASTER STATUS;
+——————-+———-+————–+——————+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+——————-+———-+————–+——————+
| binary-log.000001 | 354 | | |
+——————-+———-+————–+——————+
mysql> DROP TABLE time_zone_leap_second;
mysql> SET SQL_LOG_BIN=0;
mysql> DROP TABLE time_zone_name;
mysql> SET SQL_LOG_BIN=1;
mysql> DROP TABLE time_zone_transition;
mysql> SHOW MASTER STATUS;
+——————-+———-+————–+——————+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+——————-+———-+————–+——————+
| binary-log.000001 | 674 | | |
+——————-+———-+————–+——————+
$ mysqlbinlog binary-log.000001 –start-position=354 –stop-position=674
# at 354
#100604 18:00:08 server id 1 end_log_pos 450 Query thread_id=1 exec_time=0 error_code=0
use mysql/*!*/;
SET TIMESTAMP=1275688808/*!*/;
DROP TABLE time_zone_leap_second
/*!*/;
# at 579
#100604 18:04:31 server id 1 end_log_pos 674 Query thread_id=2 exec_time=0 error_code=0
use mysql/*!*/;
SET TIMESTAMP=1275689071/*!*/;
DROP TABLE time_zone_transition
/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
Should that statement be run on MySQL Slaves?
Is the binary log used for any level of auditing?
GRANT ALL is bad for slave consistency.
The reserved connection
MySQL reserved one connection for an administrator to be able to login to a server. For example.
$ mysql -uroot
mysql> show global variables like ‘max_connections’;
+—————–+——-+
| Variable_name | Value |
+—————–+——-+
| max_connections | 3 |
+—————–+——-+
1 row in set (0.07 sec)
mysql> show global status like ‘threads_connected’;
+——————-+——-+
| Variable_name | Value |
+——————-+——-+
| Threads_connected | 4 |
+——————-+——-+
mysql> SHOW PROCESSLIST;
+—-+——+———–+——-+———+——+————+—————
| Id | User | Host | db | Command | Time | State | Info
+—-+——+———–+——-+———+——+————+—————
| 13 | good | localhost | odtug | Query | 144 | User sleep | UPDATE test1 …
| 14 | good | localhost | odtug | Query | 116 | Locked | select * from test1
| 15 | good | localhost | odtug | Query | 89 | Locked | select * from test1
| 15 | root | localhost | odtug | Query | 89 | Locked | SHOW PROCESSLIST
However if all application users are already using the SUPER privilege, the administrator will get.
$ mysql -uroot
ERROR 1040 (HY000): Too many connections
There is no way to be able to login and see what’s happening, or kill threads for example. In this case you either wait, or you are required to kill the mysqld process, or fine the client threads to kill. The result of the former may lead to a corrupt database requiring additional recovery.
GRANT ALL is bad for system administration and monitoring.
Conclusion
Don’t use GRANT ALL for application users. For more information, including why I only listed just 5 issues, check out my MySQL Idiosyncrasies that BITE presentation. I will also be presenting this talk at MySQL Sunday at Oracle Open World in September.
Aug 03, 2010
ionice -c 2 cp -R /mnt/mysql_snapshot /mnt/backup/daily/20100719/
Are you running out of space?
Monitor snapshot’s allocated size: if there’s just one snapshot, do it like this:
Every 10.0s: lvdisplay | grep Allocated????????????????????????????????????????????????????????????????????????????????????????????????????????????????? Mon Jul 19 09:51:29 2010
Allocated to snapshot? 3.63%
Don’t let it reach 100%.
Avoid running out of space
To make sure you don’t run out of snapshot allocated size, stop all administrative scripts.
Are you running your weekly purging of old data? You will be writing a lot of pages, and all will have to fit in the snapshot.
Building your reports? You may be creating large temporary tables; make sure these are not on the snapshot volume.
Rebuilding your Sphinx fulltext index? Make sure it is not on the snapshot volume, or postpone till after backup.
You will gain not only snapshot space, but also faster backups.
Someone did the job before you
Use mylvmbackup: the MySQL LVM backup script by Lenz Grimmer. Or do it manually: follow this old-yet-relevant post by Peter Zaitsev.
Jul 21, 2010
mysql> SELECT * FROM information_schema.user_statistics WHERE user=”#mysql_system#” \G
*************************** 1. row ***************************
USER: #mysql_system#
TOTAL_CONNECTIONS: 1
CONCURRENT_CONNECTIONS: 0
CONNECTED_TIME: 446
BUSY_TIME: 74
CPU_TIME: 0
BYTES_RECEIVED: 0
BYTES_SENT: 63
BINLOG_BYTES_WRITTEN: 0
ROWS_FETCHED: 0
ROWS_UPDATED: 127576
TABLE_ROWS_READ: 4085689
SELECT_COMMANDS: 0
UPDATE_COMMANDS: 119127
OTHER_COMMANDS: 89557
COMMIT_TRANSACTIONS: 90259
ROLLBACK_TRANSACTIONS: 0
DENIED_CONNECTIONS: 1
LOST_CONNECTIONS: 0
ACCESS_DENIED: 0
EMPTY_QUERIES: 0
1 row IN SET (0.00 sec)
In this case CONNECTED_TIME is 446 second, out of this replication thread was busy (BUSY_TIME) 74 seconds which means replication capacity is 446/74 = 6
You normally would not like to measure it from the start but rather take the difference in these counters every 5 minutes or other interval of your choice.
2) Use full slow query log and mk-query-digest. This method is great for one time execution especially as it comes together with giving you the list of queries which load replication
the most. It however works only with statement level replication. You need to set long_query_time=0 and log_slave_slow_statements=1 for this method to work.
Get the log file which will include all queries MySQL server ran with their times and run mk-query-digest with filter to only check queries from replication thread:
mk-query-digest slow-log –filter ‘($event->{user} || “”) =~ m/[SLAVE_THREAD]/’ > /tmp/report-slave.txt
In the report you will see something like this as a header:
PLAIN TEXT
SQL:
# 475s user time, 1.2s system time, 80.41M rss, 170.38M vsz
# Current date: Mon Jul 19 15:12:24 2010
# Files: slow-log
# Overall: 1.22M total, 1.27k unique, 558.56 QPS, 0.37x concurrency ______
# total min max avg 95% stddev median
# Exec time 819s 1us 92s 669us 260us 120ms 93us
# Lock time 28s 0 166ms 23us 49us 192us 25us
# Rows sent 4.27k 0 325 0.00 0 1.04 0
# Rows exam 30.88M 0 1.28M 26.48 0 3.07k 0
# Time range 2010-07-19 14:35:53 to 2010-07-19 15:12:22
# bytes 350.99M 5 1022.34k 301.01 719.66 5.75k 124.25
# Bytes sen 1.94M 0 9.42k 1.67 0 110.38 0
# Killed 0 0 0 0 0 0 0
# Last errn 34.11M 0 1.55k 29.26 0 185.83 0
# Merge pas 0 0 0 0 0 0 0
# Rows affe 875.19k 0 17.55k 0.73 0.99 25.61 0.99
# Rows read 2.20M 0 14.83k 1.88 1.96 24.68 1.96
# Tmp disk 4.15k 0 1 0.00 0 0.06 0
# Tmp table 14.19k 0 2 0.01 0 0.14 0
# Tmp table 8.30G 0 2.01M 7.12k 0 117.75k 0
# 0% (5k) Filesort
# 0% (5k) Full_join
# 0% (7k) Full_scan
# 0% (10k) Tmp_table
# 0% (4k) Tmp_table_on_disk
There is a lot of interesting you can find out from this header but in relation to replication capacity – you can get replication load, which is same as “concurrency” figure (0.37x) The concurrency as reported by mk-query-digest is sum of query execution time vs time range the log file covers. In this case as we know there is only one replication thread it will be same as replication load. This gives us replication capacity of 1/0.37 = 2.70
This method should work with original MySQL Server in theory, though I have not tested it. Some versions had log_slave_slow_statements unreliable and also you may need to adjust regular expression for finding users replication thread uses.
3) Processlist Pooling This method is simple – the Slave thread has different status in Show Processlist depending on if it processes query or simply waiting. By pooling processlist frequently (for example 10 times a second) we can compute the approximate percentage the thread was busy vs idle. Of course running processlist very aggressively can be an overhead especially if it is busy system with a lot of connections
PLAIN TEXT
SQL:
mysql> SHOW processlist;
+——–+————-+———–+——+———+——+———————————————————————–+——————+
| Id | User | Host | db | Command | Time | State | Info |
+——–+————-+———–+——+———+——+———————————————————————–+——————+
| 801812 | system user | | NULL | Connect | 2665 | Waiting FOR master TO send event | NULL |
| 801813 | system user | | NULL | Connect | 0 | Has READ ALL relay log; waiting FOR the slave I/O thread TO UPDATE it | NULL |
| 802354 | root | localhost | NULL | Query | 0 | NULL | SHOW processlist |
+——–+————-+———–+——+———+——+———————————————————————–+——————+
3 rows IN SET (0.00 sec)
4) Slave Catchup/Binlog Application method. We can just get the spare server with backups restored on it and apply binary log to it. If 1 hour worth of binary logs applies for 10 minutes we have replication capacity of 6. The challenge of course having spare server around and it is quite labor intensive. At the same time it can be good measurement to take during backup recovery trials when you’re doing this activity anyway. Using this way you can also measure “cold” vs “hot” replication capacity as well as how long replication warmup takes. It is very typical for servers with cold cache to perform a lot slower then they are warmed up. Measuring times for each binary log separately should give you these numbers.
The less intrusive process which can be done in production (especially if you have slave which is used for backups/reporting etc) is to stop the replication for some time and when see how long it takes to catch up. If you paused replication for 10 minutes and it took 5 minutes to catch up your replication capacity will be 3 (not 2) because you not only had to process the events for outstanding 10 minutes but also for these 5 minutes it took to catch up. The formula is (Time_Replication_Paused+Time_Took_To_Catchup)/Time_Took_To_Catchup.
So how much of replication capacity do you need in the healthy system ? It depends a lot on many things including how fast do you need to be able to recover from backups and how much your load variance is. A lot of systems have special requirements on the time it takes to warmup too (there are different things you can do about it too). First I would measure replication capacity on 5 minute intervals (or something similar) because it tends to vary a lot. When I would suggest to ensure the loaded replication capacity is at least 3 during the peak load and 5 during the normal load. This applies to normal operational load – if you push heavy ALTER TABLE through replication they will surely get your replication capacity down for their duration.
One more thing about these methods – methods 1,2,3 work well only if replication capacity is above 1, so system is caught up. If it is less than 1, so the master writes more binary logs than slave can process they will show number close to 1. the method 4 however with work even if replication can’t ever catch up – If 1 hour worth of binary logs takes 2 hours to apply, your replication capacity is 0.5.
Entry posted by peter |
No comment
Add to: | | | |
Jul 20, 2010