sixtydoses. where od is harmless.

June 22, 2009

Installing Sun VirtualBox on FreeBSD 7.1/amd64.

Filed under: Tech - od @ 12:41 am

Today I decided to install Sun VirtualBox on my FreeBSD 7.1/amd64. The installation from ports is straightforward as usual.

# csup -g -L 2 /etc/ports-supfile
# cd /usr/ports/emulators/virtualbox
# make install clean

The build got stuck when it failed to fetch Dev86src-0.16.17.tar.gz, so I fetched the distfile by hand, dropped it into /usr/ports/distfiles and ran the build again:

# wget ftp://ftp.freebsd.org/pub/FreeBSD/ports/distfiles/Dev86src-0.16.17.tar.gz
# mv Dev86src-0.16.17.tar.gz /usr/ports/distfiles/
# make install clean
# hash -r ; ldconfig

Mount procfs:
# mount -t procfs proc /proc

Load the vbox driver module and check that it shows up:
# kldload vboxdrv.ko
# kldstat | grep box
13    1 0xffffffffab46a000 3fb0e    vboxdrv.ko
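
The mount and kldload steps above do not survive a reboot. Below is a minimal sketch for checking and persisting them. The helper name module_loaded is made up for illustration, and the loader.conf and fstab lines in the comments are the usual FreeBSD conventions, so check them against the port's install message before relying on them.

```shell
#!/bin/sh
# Sketch only: module_loaded is a hypothetical helper, not part of FreeBSD.
# It reads `kldstat`-style output on stdin and succeeds if NAME.ko is listed.
module_loaded() {
    # The last column of kldstat output is the module file name, e.g. vboxdrv.ko
    awk -v mod="$1.ko" '$NF == mod { found = 1 } END { exit !found }'
}

# Typical use on the FreeBSD box (as root):
#   kldstat | module_loaded vboxdrv || kldload vboxdrv
#
# To make both steps permanent (assumed conventions, verify locally):
#   /boot/loader.conf:  vboxdrv_load="YES"
#   /etc/fstab:         proc  /proc  procfs  rw  0  0
```

Reading kldstat output from stdin keeps the helper easy to try out without touching the kernel.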

Launch VirtualBox from the menu or from the command line:
# VirtualBox

Done.

June 20, 2009

Virtualbox is now available in ports!

Filed under: Tech - od @ 8:40 am

This could be one of the bestest news I’ve heard in months!!

Excerpt from miwi:

“Today Virtualbox was committed to the FreeBSD ports tree. After a lot of work we had a good discussion today about how stable Virtualbox is, and after the CTF with take6 we got a lot of good feedback, so it was time to commit.”

Read it here.

Woohoo!!

May 23, 2009

Edge Load Balancer Network Dispatcher – Double Collocated HA on HP-UX.

Filed under: Tech - od @ 5:01 am

One of my recent projects was to configure the Edge load balancer on 2 servers in a high availability (HA) environment. I rarely do Edge, but the configuration is pretty straightforward. In my past projects, Edge has always been implemented on separate boxes, which is easier than a collocated setup. In this post I’m going to share my configuration for an Edge dispatcher (MAC forwarding) that resides together with the web server (I’m using IHS) and WebSphere. Each server uses 1 IP address for both the web server and the dispatcher. The configuration is almost the same as the separate-box setup, but I ran into a few issues, and I hope this post will be of help to those who are dealing with the Edge dispatcher as well.

For a typical setup where the Edge load balancer servers do not share a box with the web servers, the general rules are:
- Primary Edge – cluster IP aliased to its NIC.
- Standby Edge – cluster IP aliased to its loopback.
- Web Servers – cluster IP aliased to loopback.

The same rules hold in a collocated environment:
- Primary Edge – cluster IP aliased to its NIC.
- Standby Edge – cluster IP aliased to its loopback.

[Diagram: collocated Edge.]

[Diagram: double collocated HA Edge.]

Say I have the following:
Cluster IP – 192.168.10.10
Cluster port – 8080
Primary Edge – 192.168.10.20
Backup Edge – 192.168.10.21


default.cfg for Primary Edge:

dscontrol set loglevel 5
dscontrol set logsize 50000000
dscontrol executor start

dscontrol executor set nfa 192.168.10.20

dscontrol highavailability heartbeat add 192.168.10.20 192.168.10.21
dscontrol highavailability backup add primary auto 8880
dscontrol highavailability reach add 192.168.10.55
dscontrol highavailability reach add 192.168.10.56

dscontrol cluster add 192.168.10.10
dscontrol port add 192.168.10.10:8080

dscontrol server add 192.168.10.10:8080:192.168.10.20
dscontrol server add 192.168.10.10:8080:192.168.10.21

dscontrol manager start manager.log 10004
dscontrol man reach set loglevel 5
dscontrol man reach set logsize 50000000
dscontrol advisor start Http 192.168.10.10:8080 Http_192.168.10.10_8080.log



default.cfg for Standby Edge:

dscontrol set loglevel 5
dscontrol set logsize 50000000

dscontrol executor start

dscontrol executor set nfa 192.168.10.21

dscontrol highavailability heartbeat add 192.168.10.21 192.168.10.20
dscontrol highavailability backup add backup auto 8880
dscontrol highavailability reach add 192.168.10.55
dscontrol highavailability reach add 192.168.10.56

dscontrol cluster add 192.168.10.10
dscontrol port add 192.168.10.10:8080

dscontrol server add 192.168.10.10:8080:192.168.10.21
dscontrol server add 192.168.10.10:8080:192.168.10.20

dscontrol manager start manager.log 10004
dscontrol man reach set loglevel 5
dscontrol man reach set logsize 50000000
dscontrol advisor start Http 192.168.10.10:8080 Http_192.168.10.10_8080.log



goActive script:

This script will remove the cluster IP from loopback and alias it to the NIC.

#!/bin/ksh

CLUSTER=192.168.10.10
LOOPBACK=lo0:1

ifconfig $LOOPBACK 0.0.0.0
dscontrol executor configure $CLUSTER



goStandby script:
This script will remove the cluster IP from NIC and alias it to the loopback.

#!/bin/ksh

LOOPBACK=lo0:1
CLUSTER=192.168.10.10
NETMASK=255.255.255.192

dscontrol executor unconfigure $CLUSTER
ifconfig $LOOPBACK $CLUSTER netmask $NETMASK up



goInOp script:
This script will remove the cluster IP from all devices (loopback and NIC).

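#!/bin/ksh

CLUSTER=192.168.10.10
# Same loopback alias as in goActive/goStandby.
LOOPBACK=lo0:1
NETMASK=255.255.255.192

# Take the cluster IP off the NIC, then bring the loopback alias down.
dscontrol executor unconfigure $CLUSTER
ifconfig $LOOPBACK $CLUSTER netmask $NETMASK down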
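
After any of the three scripts has run, it helps to confirm where the cluster IP actually ended up. The sketch below assumes `netstat -in` prints the interface name in column 1 and the address in column 4, as it usually does on HP-UX. The function name cluster_ip_iface is made up for illustration.

```shell
#!/bin/sh
# Sketch only: cluster_ip_iface is a hypothetical helper.
# Reads `netstat -in`-style output on stdin and prints every interface
# whose Address column (assumed to be column 4) equals the given IP.
cluster_ip_iface() {
    awk -v ip="$1" '$4 == ip { print $1 }'
}

# Typical use on an Edge server:
#   netstat -in | cluster_ip_iface 192.168.10.10
# Expect the NIC on the active box and lo0:1 on the standby box.
```

If the command prints nothing, the cluster IP is not configured anywhere on that box, which is what goInOp is meant to leave behind.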



The normal way to test whether high availability works smoothly is to unplug the network cable from the Edge server. I would tail the root mail (/var/mail/root) at the same time, so I could see which HA script got triggered when the network is interrupted. Another method is to bring down the server, by rebooting it or shutting it down. With a reboot you’ll only have a short time span to watch the failover in action, though of course this depends on how long your servers take to start up.

But since this is a collocated environment, if I opted for either of those testing methods, I wouldn’t be able to see whether the dispatcher balances requests across both web servers (in my case I’m using the round robin algorithm), because the web server on that box goes away too. So what I did was stop the executor manually so that failover occurs. Note that stopping the dsserver alone won’t trigger the HA scripts, and in fact it is not necessary to stop the dsserver at all. To be honest, even when it’s not a collocated environment I normally test HA failover by stopping the executor, since I’m usually working remotely and unplugging the cable means getting help from the sys admins. So I might as well check that it’s really working before going through all that hassle.
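
The executor-stop test can be sketched roughly as below. The helper last_ha_script is made up, and it assumes the script names (goActive, goStandby, goInOp) appear somewhere in the text of the root mail, which may not hold on every setup.

```shell
#!/bin/sh
# Sketch only: last_ha_script is a hypothetical helper.
# It reads text on stdin (for example /var/mail/root) and prints the last
# HA script name mentioned in it. Assumption: the names show up in the mail.
last_ha_script() {
    awk '{
        while (match($0, /go(Active|Standby|InOp)/)) {
            last = substr($0, RSTART, RLENGTH)
            $0 = substr($0, RSTART + RLENGTH)
        }
    }
    END { if (last != "") print last }'
}

# Rough test run on the active box:
#   dscontrol executor stop
#   last_ha_script < /var/mail/root
#   dscontrol highavailability status    # on the other box
```

awk is used instead of `grep -o` because not every grep has that option.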

One of the problems I encountered was instability. Sometimes the dispatchers would run in the right mode (active | standby), but most of the time both would run as active. It was very unstable, with no pattern that I could track. Even worse, sometimes when I tried to run the dispatcher as a standalone lb, all of the incoming requests were routed directly to the web server, skipping the dispatcher completely. I was stuck with this problem for several days before I finally figured out what the culprit was.

The ibmlb module.

Every time the executor is stopped, the ibmlb module is unloaded. Every time the executor starts, the ibmlb module is loaded into the kernel. I’m lucky that I have dmesg on both servers, and according to dmesg this is what it should look like whenever you stop and start the executor:

ibmlb DLKM successfully unloaded
ibmlb DLKM successfully loaded

But what actually happened was that when I stopped the executor, ibmlb was not unloaded. The status was busy, and I had to unload the module explicitly.

ibmlb DLKM successfully unloaded
ibmlb DLKM successfully loaded
ibmlb version is 06.01.00.00 – 20060515-232359 [wsbld265]
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb

I’ve not seen anything like this before (I used to configure dispatcher on AIX servers). Consider the following test cases (arp table checked from a different server that resides on the same segment):

TEST 1.

1) Primary active, Backup standby. Cluster IP belongs to Primary.
2) Primary down, Backup goes active. Module ibmlb is UNLOADED successfully on Primary. Cluster IP belongs to Backup.
3) Primary up in active mode, Backup goes standby. Cluster IP belongs to Primary.

TEST 2.
1) Primary active, Backup standby. Cluster IP belongs to Primary.
2) Primary down, Backup goes active. Module ibmlb is busy and still LOADED on Primary. Cluster IP belongs to Backup.
3) Primary up in active mode, Backup stays active. Cluster IP belongs to Primary, but all requests will skip dispatcher and go straight to the web server.


TEST 3.

1) Primary active, Backup standby. Cluster IP belongs to Primary.
2) Primary down, Backup goes active. Module ibmlb is UNLOADED successfully on Primary. Cluster IP belongs to Backup.
3) Primary up in active mode, Backup goes standby. Cluster IP belongs to Primary.
4) Backup down. Module ibmlb is UNLOADED successfully.
5) Backup up, running in standby mode.
6) Backup down. Module ibmlb is busy and still LOADED on backup.
7) Backup up, running in active mode (remember that Primary is also in active mode too). Cluster IP belongs to Backup, but all requests will skip the dispatcher and go straight to the web server.
8) Backup down. Module ibmlb is busy and still LOADED on backup. Explicitly unload the module using kcmodule command until it gets UNLOADED. Cluster IP belongs to Primary.
9) Backup up, running in standby mode.

Most of the time I couldn’t unload it right away; I had to let the server ‘rest’ for about 15 – 20 minutes before trying to unload it again. Rebooting the server always solves this problem (the module’s next state is unused). Am not sure if there’s a way to force a module to unload though. As far as I know there’s no force flag for kcmodule.
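
Since the unload only succeeds after some waiting, the retrying can be scripted. Below is a small generic sketch. The retry function is made up for illustration, and the kcmodule call in the comment assumes `ibmlb=unused` is the right way to request the unload on your HP-UX release.

```shell
#!/bin/sh
# Sketch only: retry is a hypothetical helper.
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Wait before the next try, but not after the last one.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}

# Example (assumed kcmodule syntax): try every 5 minutes, up to 6 times.
#   retry 6 300 kcmodule ibmlb=unused
```

This does not force anything. It only repeats the same polite request until the module stops being busy.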

I was fooled several times since I tested the splash page of the web servers from my Opera browser. I was on a different subnet, so I guess there must be a switch/router in between me and the Edge servers. At times, even when the cluster IP was aliased to the Primary Edge, my browser would still hit the Backup Edge because the ARP cache had not been refreshed. It was so annoying since this affects the cluster report. The rest of the testing was done from a browser on a different server that belongs to the same subnet. At least there I could clear the ARP cache manually if I had to.

Okay probably this is my browser problem, but testing the splash page with Firefox sucks. It kept on hitting the splash page even after I’ve stopped both web servers, and cleared up the cache. It was alright with Opera though. What gives?

By the way I’m using Edge v6.1. If you check out the Edge Fixpack page here, you’ll notice that there is no patch for HP-UX. Not a single patch. Is IBM trying to say something? Don’t use Edge on HP-UX, perhaps? Lol. Anyway, IBM packed me a patch (6.1.0.35), but still it didn’t address the module issue. Am not sure if I could call it a patch though, it’s more like an installer since I had to reinstall everything.

Thanks to Robert Brown from IBM for assisting me on this ‘false alarm’ panic attack (initially I thought it was a network issue).

March 14, 2009

Fitter happier.

Filed under: Life, Tech - od @ 11:19 pm

Yea, there has been a lack of update. I was busy with WebLogic, and Devin.. oh well, he only writes material that meets a certain literary and relevancy standard and he is not that prolific. Lol. Guess you’ll never find him writing anything that will be tagged as rant :P

Few things happened last week..

Weblogic completed:
Am done with the project last week, which is cool. Haven’t gone through the UAT session with the users yet, so until they say “All’s good”, guess I’ll have to keep my fingers crossed. Weeee!!!!



I lost my handphone:
Yes, my cheap handphone. I can’t believe someone would want to steal it. I bought the handphone for about 300 – 400 bucks, so am surprised that someone would be so keen to steal it. How much does it worth really? 2 small packets of weed? Am not bothered about the phone. Well alright, I am bothered, because it costs money. But what I care the most is the data inside the phone. My contacts, my messages, heck I even have a few audio clips of my cat purring recorded with that phone. Oh well. Fuck you, thief.



Medical checkups:
I had a couple of medical checkups performed last Tuesday and Thursday. Last Tuesday I got all wired up for the holter monitoring test. It was something like a 24 hours of ECG (electrocardiogram) to monitor the electrical activity of my heart. I couldn’t take showers during the procedure because it’ll damage the device. So yeah, am pretty happy with the fact that I don’t have to shower and not feeling guilty about it. Am just plain lazy sometimes.

And last Thursday, I got up very early in the morning and headed to the General Hospital to get my brain checked. It was a quick EEG (electroencephalography), prolly around 20 minutes where I had to blink my eyes numerous times, with bright light flickered directly on me for a few minutes and inhale/exhale profusely for 5 minutes which was tiring since am asthmatic. And oh my, I looked horrible with my hair glued on with all the wires. And I looked even worse after the guy pulled them out. I looked like a woman who is trying too hard to impress a guy by applying extra hair gel. You know the type of hair gel that will make your hair hard and stiff? Gah. I hate it.

I’ll be getting the results this April. I hope everything will be fine, since am always fine, and anyway I agreed to go through all of these medical procedures because my sister told me to. Nothing serious.



FreeBSD upgrade:
Yeah, finally! Upgraded to 7.1. I did it last Saturday, and I was thinking of updating this blog on Sunday, but guess what, the update went fine until suddenly the X server failed to start. Hmmm.. now that’s weird, because this was not the first time I did a FreeBSD upgrade, plus, this was just a minor version upgrade, so what could go wrong? Building world and kernel and installing them went perfectly well. I didn’t forget to run mergemaster, and the machine booted up fine. I spent a day changing the theme, and all of a sudden it failed (yeah, ironically it failed after I spent the whole day beautifying my desktop, why didn’t it fail sooner?). I realized that it failed just after I ran portupgrade to upgrade my Opera and Firefox. Probably the modules were out of sync, am not sure.

I spent the next 48 hours running portupgrade numerous times, until I finally decided to rebuild perl, xorg and xfce4. But to my surprise it failed. I reinstalled my xfce4 and its gang one by one, so they’d all run on the same version. Rebooted the machine and voila, am back. Still, some of my installed packages were broken, so I had to fix them.

So they say, if it ain’t broken, don’t fix it. I could’ve just stuck with FreeBSD 7.0 and saved myself 48 hours of agonizing troubleshooting. But seriously, it was worth it. The performance is so much better, XFCE 4.6 is brilliant with more new features and yes fellas.. flash 9 works :)



He got engaged:
Yes, he finally got engaged. Amazing. Congratulations.

January 26, 2009

So it’s Gnome, Linus?

Filed under: Tech - od @ 10:17 am

So much for his Nazi comment that ruffled the feathers of the GNOME community last time.

I have always been a loyal user of XFCE4, but if I had my druthers between Gnome and KDE, I’d prefer Gnome. KDE is nice, I love some of KDE apps, but imho, it’s kinda bloated. Very pretty, way prettier than Gnome, but bloated. But of course, I’d go for Enlightenment if I want some real fancy stuff on my desktop.

Anyways, am not any DE/WM fanatic. I just love to keep my desktop clean and minimal.

I lol’ed.



Source: Q&A: Linux founder Linus Torvalds talks about open-source identity

December 16, 2008

WebLogic 10.3 on FreeBSD?

Filed under: Tech - od @ 11:57 pm

I wish.. lol.

Am very new to weblogic, so am pretty excited to try it on my FreeBSD at home. Well ok, I just managed to install it, but couldn’t get it to run. Bleargh. The problem is with the LD_LIBRARY_PATH I think. There’s no native directory under <WL_HOME>/server, so it’ll complain about the missing path when you try to start the domain or node manager. Pfftttt. I’d love to know how to fix this :(

But anyways, here’s how to install WebLogic 10g on FreeBSD 7. Just installing it, but it doesn’t work lol. I don’t know why I even bother writing this.

1 – First of all, download the installer from the Oracle website. Choose HP-UX as the operating system, as that’ll give you a generic jar installer.
http://www.oracle.com/technology/software/products/ias/htdocs/wls_main.html

2 – While it’s downloading, install eclipse from the port.
cd /usr/ports/java/eclipse && make install clean

3 – Install eclipse WTP. Get it from here:
http://www.eclipse.org/downloads/download.php?file=/webtools/downloads/drops/R2.0/R-2.0.3-20080710044639/wtp-R-2.0.3-20080710044639.zip

Place the zip file at /usr/local and unzip it. It’ll place all the extracted files in the right directory.

4 – After you’re done with eclipse, you’re ready to install weblogic. Am performing this as a non root user.

I use diablo java, so running java -jar server103_generic.jar alone will not work.

My java -version:
java version "1.5.0"
Java(TM) 2 Runtime Environment, Standard Edition (build diablo-1.5.0-b01)
Java HotSpot(TM) 64-Bit Server VM (build diablo-1.5.0_07-b01, mixed mode)

So run the installer with the Sun JDK’s java binary instead:
/usr/local/jdk1.6.0/bin/java -Dos.name=unix -jar server103_generic.jar

You’ll have to pass -Dos.name=unix, else you’ll get the insufficient disk space error, which is very annoying when you actually have tons of free space. You may get lucky with this and the installation will end successfully. As for me, it got stuck at 74% while creating the sample domain.

$ /usr/local/jdk1.6.0/bin/java -Dos.name=unix -jar server103_generic.jar
Extracting 0%……………………………………………………………………………………….100%
Exception in thread "Thread-14" java.lang.OutOfMemoryError: Java heap space
at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:42)
at com.bea.plateng.common.util.JarHelper.extract(JarHelper.java:790)
at com.bea.plateng.common.util.JarHelper.extract(JarHelper.java:676)
at com.bea.plateng.common.util.JarHelper.extract(JarHelper.java:634)
at com.bea.plateng.domain.TemplateImporter.generate(TemplateImporter.java:237)
at com.bea.plateng.domain.script.ScriptExecutor$2.run(ScriptExecutor.java:2785)

The error was an out of memory exception, so I decided to reinstall with a bigger heap. But first I had to uninstall it, since weblogic detected that it had already been installed in the target directory. Then I ran the installer with the following command:
/usr/local/jdk1.6.0/bin/java -Xmx2G -Dos.name=unix -jar server103_generic.jar

Installation complete!

Alright, now the installation is done, you might wanna try to start the sample domain.

Ahah! Now this is the part where I got unlucky.

It complains that the port is being used, when it’s not! Hmmmmphhhhh!

<Dec 13, 2008 12:25:25 PM MYT> <Notice> <Security> <BEA-090169> <Loading trusted certificates from the jks keystore file /usr/local/jdk1.6.0/jre/lib/security/cacerts.>
<Dec 13, 2008 12:25:29 PM MYT> <Error> <Server> <BEA-002606> <Unable to create a server socket for listening on channel “MedRec Local Network Channel”. The address 127.0.0.1 might be incorrect or another process is using port 7011: java.net.BindException: Can’t assign requested address.>
<Dec 13, 2008 12:25:29 PM MYT> <Error> <Server> <BEA-002606> <Unable to create a server socket for listening on channel “Default”. The address 192.168.0.1 might be incorrect or another process is using port 7011: java.net.BindException: Can’t assign requested address.>
<Dec 13, 2008 12:25:29 PM MYT> <Error> <Server> <BEA-002606> <Unable to create a server socket for listening on channel “DefaultSecure”. The address 192.168.0.1 might be incorrect or another process is using port 7012: java.net.BindException: Can’t assign requested address.>
<Dec 13, 2008 12:25:29 PM MYT> <Emergency> <Security> <BEA-090087> <Server failed to bind to the configured Admin port. The port may already be used by another process.>
<Dec 13, 2008 12:25:29 PM MYT> <Critical> <WebLogicServer> <BEA-000362> <Server failed. Reason: Server failed to bind to any usable port. See preceeding log message for details.>
<Dec 13, 2008 12:25:29 PM MYT> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FAILED>
<Dec 13, 2008 12:25:29 PM MYT> <Error> <WebLogicServer> <BEA-000383> <A critical service failed. The server will shut itself down>
<Dec 13, 2008 12:25:29 PM MYT> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FORCE_SHUTTING_DOWN>
Stopping PointBase server…
PointBase server stopped.
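
One thing worth ruling out here: "Can't assign requested address" is normally what a bind gives back when the listen address is not configured on any local interface, which is different from a port clash. The sketch below checks that from `ifconfig` output. The helper name has_inet_addr is made up, and whether this is the actual cause in this case is only a guess.

```shell
#!/bin/sh
# Sketch only: has_inet_addr is a hypothetical helper.
# Reads `ifconfig -a`-style output on stdin and succeeds if ADDR appears
# as an "inet" address on some interface.
has_inet_addr() {
    awk -v addr="$1" '$1 == "inet" && $2 == addr { found = 1 } END { exit !found }'
}

# Typical use on the FreeBSD box:
#   ifconfig -a | has_inet_addr 192.168.0.1 || echo "192.168.0.1 is not configured here"
# And to see whether anything really listens on the port:
#   sockstat -4 -l | grep 7011
```

If the address is missing, changing the listen address in the domain configuration is a more promising direction than hunting for a process on the port.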

Well actually there were other errors that came up before that but am too lazy to look at them. Lolz.

Now if you try to create a domain, you’ll get the shared library path error:

./config.sh: Don’t know how to set the shared library path for FreeBSD.
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
at java.awt.Container.createHierarchyEvents(Container.java:1366)
at java.awt.Container.createHierarchyEvents(Container.java:1366)
at java.awt.Container.createHierarchyEvents(Container.java:1366)
at java.awt.Container.createHierarchyEvents(Container.java:1366)
at java.awt.Container.addImpl(Container.java:1082)
at java.awt.Container.add(Container.java:903)
at com.bea.plateng.wizard.GUIContext$8.run(GUIContext.java:480)
at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:209)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:597)
at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:273)
at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:183)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:173)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:168)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:160)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:121)

Same goes if you try to start a node manager.

Ah well.. think I’ll spare some time for playing around with this later. Doing weblogic makes me miss websphere. Lol.

November 17, 2008

Back them up.

Filed under: Tech - od @ 9:07 pm

I learned my lesson the hard way. Two years ago I bought a 120gb WD hard drive and used it for about 6 months before it suddenly died on me. At 6 months it was still covered by the warranty, but I chose not to return it in exchange for a new one. I have tons of data inside, but out of all of it I need prolly around 30%. The rest, I can say goodbye to with tears in my eyes. I still have the hard disk, kept nicely inside my wardrobe. When I have the chance, I’ll bring it back to life!

I love Maxtor hard drives. I still have one old Maxtor hdd that has been running for almost 8 years. In fact I used to have 2 of them. One has failed, but that is completely acceptable since it served me for over 6 years.

Currently at work am preparing backup scripts for some servers. I find that backing up the entire directory of a windows virtual machine is kinda annoying. Am still thinking about the best way to shut down windows without using the force flag. Urgh. I wish I could just init 0.

At home I use a simple script that I wrote to back up directories. Basically it compares the contents of the source and target directories, and any files/directories that are missing from the target get copied over, so the target ends up as a duplicate of the original directory. It also writes a log (diff.log) recording which files have been copied. I use it to back up log files, nothing in giga size.

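#!/usr/local/bin/bash

SOURCE="$HOME/personal/script/cmp2dirs/diffmv/cmp2"
DEST="$HOME/personal/script/cmp2dirs/diffmv/temp"
# Short names (last path component) for the log messages.
sSOURCE=`echo "$SOURCE" | awk -F/ '{print $NF}'`
sDEST=`echo "$DEST" | awk -F/ '{print $NF}'`
# %v is the BSD date(1) shorthand for %e-%b-%Y, e.g. 15-Nov-2008.
DATE=`date +"%v"`

echo "Directories comparison on $DATE" >> diff.log

diff -rq "$SOURCE" "$DEST" >> diff.log

s1="Directories comparison on $DATE"
# Last line of the log: still the header if diff printed nothing.
s2=`sed '$!d' diff.log`

if [ "$s1" != "$s2" ]; then
# Keep only the part of the log from today's header onwards.
sed -n '/'"$DATE"'/,$p' diff.log > copy.out

echo "Directory(s)/file(s) copied from $sSOURCE to $sDEST:" >> diff.log
# "Only in DIR: NAME" becomes DIR/NAME
f1=`sed -n '/^[Only in ]* /s///;s/:./\//p' copy.out`
# "Files A and B differ" becomes A
f2=`sed -n '/Files/s/^.*Files.\(.*\) and.*$/\1/p' copy.out`

# Split on newlines only, so names with spaces survive.
IFS='
'

cp -vR $f1 $f2 "$DEST" >> diff.log

else
echo "Directory $sSOURCE is unchanged since yesterday." >> diff.log
fi

echo >> diff.log
echo "---------------------------------------------------------------" >> diff.log
echo >> diff.log

exit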

The output of diff.log will be something like this:

Directories comparison on 15-Nov-2008
Only in /home/od/personal/script/cmp2dirs/diffmv/cmp2: white space 2
Directory(s)/file(s) copied from cmp2 to temp:
/home/od/personal/script/cmp2dirs/diffmv/cmp2/white space 2 -> ./temp/white space 2
/home/od/personal/script/cmp2dirs/diffmv/cmp2/white space 2/wsp -> ./temp/white space 2/wsp

---------------------------------------------------------------

Directories comparison on 16-Nov-2008
Directory cmp2 is unchanged since yesterday.

---------------------------------------------------------------

Directories comparison on 17-Nov-2008
Only in /home/od/personal/script/cmp2dirs/diffmv/cmp2: ghi
Directory(s)/file(s) copied from cmp2 to temp:
/home/od/personal/script/cmp2dirs/diffmv/cmp2/ghi -> ./temp/ghi

---------------------------------------------------------------
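
For comparison, the "copy whatever is missing" idea can also be sketched without parsing diff output. This is a different, simpler technique than the script above: it only looks at top-level entries, skips dot files, and does not notice files that exist on both sides but differ. The function name copy_missing is made up for illustration.

```shell
#!/bin/sh
# Sketch only: copy_missing is a hypothetical helper, not the script above.
# copy_missing SRC DST
# Copies every top-level entry of SRC that does not exist in DST yet and
# prints one "source -> target" line per copied entry.
copy_missing() {
    src=$1
    dst=$2
    for entry in "$src"/*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$entry" ] || continue
        name=${entry##*/}
        if [ ! -e "$dst/$name" ]; then
            cp -R "$entry" "$dst/" && echo "$entry -> $dst/$name"
        fi
    done
}

# Example:
#   copy_missing "$HOME/personal/script/cmp2dirs/diffmv/cmp2" \
#                "$HOME/personal/script/cmp2dirs/diffmv/temp" >> diff.log
```

Quoting every path means names with spaces work without touching IFS.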

October 14, 2008

HITB 2008.

Filed under: Tech - od @ 10:14 pm

September 23, 2008

DB2 migration across different platforms.

Filed under: Tech - od @ 2:36 am

So, how do you migrate a database from one server to another server across different platforms? There are a few articles related to this topic on the net, and some of them are very good, but I still stumbled on some problems during the migration process. This is not an expert howto article; do note that I am next to clueless when it comes to databases. But yea, this is a howto article so that I remember how I did it the last time, and hopefully it’ll be of help to anyone who comes across this post.

Basically to migrate a db between 2 servers running on different platforms, you’ll need these 2 awesome utility commands:

DB2MOVE

Use db2move to export all tables and data in PC/IXF format.

The db2move command:
db2move <database-name> <action> [<option> <value>]

DB2LOOK

Use db2look command to extract the DDL statements. What are DDL statements? DDL statements are used to build and modify the structure of your tables and other objects in the database.

The db2look command:

db2look -d <database-name> [<-option1> <-option2> ... <-optionx>]

STEP-BY-STEP example – based on real scenario, with problems encountered, successfully solved.

Scenario:

Migrating DB2 v8.2 ESE on Linux CentOS 5 to DB2 v9.5 on Windows 2003.

BEFORE MIGRATION STARTS

I have 5 databases located on linux running on DB2 ESE version 8.2. This is my first time doing DB2, my first time seeing the databases, so I did the following before I start migrating:

1) Do a full backup for all databases. This is so very important that I think it’s worth mentioning it another 3 times. In caps. – DO A FULL BACKUP FOR ALL DATABASES. DO A FULL BACKUP FOR ALL DATABASES. DO A FULL BACKUP FOR ALL DATABASES.
2) Record down the number of tables listed in each database.
3) Do a full backup for all databases.

MIGRATE

On Linux:

Export the data with the db2move command (no database connection needed). Run the command in a separate directory for each database, as it will create a number of IXF files, depending on how big your database is.

db2move db1 export
db2move db2 export
db2move db3 export
db2move db4 export
db2move db5 export

Generate the DDL statements with the db2look command (no database connection needed). The -d option names the database:

db2look -d db1 -e -a -td @ -l -o db1.sql
db2look -d db2 -e -a -td @ -l -o db2.sql
db2look -d db3 -e -a -td @ -l -o db3.sql
db2look -d db4 -e -a -td @ -l -o db4.sql
db2look -d db5 -e -a -td @ -l -o db5.sql
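
The two export steps can be wrapped into one loop with a directory per database. This is only a sketch: export_db and the /tmp/migrate path are made-up names, and the db2look call passes the database with -d, following the command syntax given earlier in this post.

```shell
#!/bin/sh
# Sketch only: export_db is a hypothetical wrapper around db2move and db2look.
# export_db DBNAME OUTDIR
# Creates OUTDIR/DBNAME and runs both export steps inside it.
export_db() {
    db=$1
    outdir=$2
    mkdir -p "$outdir/$db" || return 1
    (
        # Work in a subshell so the caller's directory is left alone.
        cd "$outdir/$db" || exit 1
        db2move "$db" export &&
        db2look -d "$db" -e -a -td @ -l -o "$db.sql"
    )
}

# Example (run as the instance owner on the Linux box):
#   for db in db1 db2 db3 db4 db5; do
#       export_db "$db" /tmp/migrate || break
#   done
```

Keeping each database in its own directory stops the tabNN.ixf files and db2move.lst of different databases from overwriting each other.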

I didn’t want to use the default delimiter, the semicolon (;), because am not sure if there are any stored procedures or functions (am not even sure what those are) in the databases. So just to be on the safe side, I used ‘@’ as the termination character instead.

So far, so good.

FTP the files over to the Windows server.

All of the *.ixf files – transfer them in binary mode.
db2move.lst – transfer it in ascii mode.
*.sql (generated by the db2look command) – transfer them in ascii mode.

On Windows:

I already have a DB2 ESE version 9.5 installed, DAS user and instance created (I prefer the names to match with the db running on linux).

Create all the databases that I want to import in.

db2 create db db1
db2 create db db2
db2 create db db3
db2 create db db4
db2 create db db5

Run the script generated by db2look (no database connection needed).

db2 -td@ -vf db1.sql
db2 -td@ -vf db2.sql
db2 -td@ -vf db3.sql
db2 -td@ -vf db4.sql
db2 -td@ -vf db5.sql

Notice that I specified the -l option while running the db2look command, which means it will generate the DDL statements for user-defined table spaces, database partition groups and buffer pools. Check the sql script and change the location path to match the Windows environment before executing them. Something like:

'/home/db2inst1/db2inst1/blah/path3/db2inst1_data.tbs' 30000
to
'C:\db2inst1\blah\path3\db2inst1_data.tbs' 30000

Else, you’ll get a ‘Bad container path’ error.
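
Editing the container paths by hand gets tedious with five scripts, so here is a rough sketch of doing it with awk. The function name to_windows_paths and the file name db1_win.sql are made up, and the prefix mapping (/home/db2inst1/ to C:\) just mirrors the example above. Review the result before running it.

```shell
#!/bin/sh
# Sketch only: to_windows_paths is a hypothetical helper.
# to_windows_paths UNIX_PREFIX WINDOWS_PREFIX < in.sql > out.sql
# Rewrites single-quoted paths that start with UNIX_PREFIX: the prefix is
# swapped for WINDOWS_PREFIX and the remaining slashes become backslashes.
to_windows_paths() {
    FROM_PREFIX=$1 TO_PREFIX=$2 awk '
    BEGIN {
        # Prefixes come in via the environment so backslashes stay literal.
        from = ENVIRON["FROM_PREFIX"]; to = ENVIRON["TO_PREFIX"]
        q = "\047"; bs = "\\"
    }
    {
        out = ""; rest = $0
        while ((i = index(rest, q from)) > 0) {
            out = out substr(rest, 1, i)               # up to the opening quote
            rest = substr(rest, i + 1 + length(from))  # drop the Unix prefix
            j = index(rest, q)                         # closing quote of the path
            if (j == 0) { rest = from rest; break }    # unterminated: leave as is
            n = split(substr(rest, 1, j - 1), part, "/")
            path = part[1]
            for (k = 2; k <= n; k++) path = path bs part[k]
            out = out to path
            rest = substr(rest, j)
        }
        print out rest
    }'
}

# Example:
#   to_windows_paths /home/db2inst1/ 'C:\' < db1.sql > db1_win.sql
```

Lines that do not contain a quoted path with the given prefix pass through unchanged.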

I prefer to pipe the result to a file so that I can review it later. Most of the time I wasn’t able to monitor the output since some of the databases are pretty huge and I worked remotely with a lousy, lousy network connection (I love rdesktop for this).

By this time, my databases contain all the tables as the original databases on linux do. But of course, they’re all empty.

Normally, there shouldn’t be any problems until you come to the data loading part (no database connection needed).

db2move db1 load
db2move db2 load
db2move db3 load
db2move db4 load
db2move db5 load

db2move utility will also create an output file based on the action that you specified (in my case, it’s LOAD.out), so I don’t have to bother piping the result to a file.

If this part ended successfully, you’re all done. Unfortunately for me, there are warnings inside the LOAD.out files. I have 5 LOAD.out files altogether, and 4 of them contain the same warning code:

* LOAD: table "DB2INST1"."RQVIEWS"
*** WARNING 3107. Check message file tab52.msg!
*** SQL Warning! SQLCODE is 3107
*** SQL3107W There is at least one warning message in the message file.

So what’s in tab52.msg?

SQL3229W The field value in row "1" and column "9" is invalid. The row was
rejected. Reason code: "1".

SQL3185W The previous error occurred while processing data from row "1" of
the input file.

SQL3229W The field value in row "2" and column "9" is invalid. The row was
rejected. Reason code: "1".

SQL3185W The previous error occurred while processing data from row "2" of
the input file.

SQL3229W The field value in row "3" and column "9" is invalid. The row was
rejected. Reason code: "1".

SQL3185W The previous error occurred while processing data from row "3" of
the input file.

SQL3229W The field value in row "4" and column "9" is invalid. The row was
rejected. Reason code: "1".
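
With several LOAD.out files to go through, a small filter saves some scrolling. This sketch assumes the files look like the excerpt above (a "LOAD: table" line followed by a "WARNING ... Check message file" line). The function name load_warnings is made up.

```shell
#!/bin/sh
# Sketch only: load_warnings is a hypothetical helper.
# load_warnings FILE...
# Prints "file: table -> message file" for every warning in db2move output.
load_warnings() {
    awk '
    # Remember the table currently being loaded.
    /LOAD:[ \t]+table/ { table = $NF }
    # A warning line names the message file as its last word.
    /WARNING [0-9]+\. Check message file/ {
        msg = $NF
        sub(/!$/, "", msg)
        print FILENAME ": " table " -> " msg
    }' "$@"
}

# Example:
#   load_warnings db1/LOAD.out db2/LOAD.out
```

Each output line points at the table and the tabNN.msg file worth opening next.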

Data type mismatch? To be frank, I don’t know, but when I went back over the db2move options, there was one that I had probably missed.

-l lobpaths

LOB stands for Large OBject. A large object (LOB) is a string data type with a size ranging from 0 bytes to 2 GB (GB equals 1 073 741 824 bytes).

So, if you know where your lobs are, specify this option while exporting the data, and make sure to check that you have files with names similar to this when you’re done.

tab52a.001.lob

Being a complete noob in the world of db, I don’t know where the lobs are. In fact, I didn’t even know what the term meant the first time I encountered it (no wonder I ignored the -l option in the first place lol). So, I decided to export the db on linux once again, this time running db2move from the Windows side so the files land straight on Windows, on the fly. This way, even without specifying the -l option, it exported my LOBs as well. Nice.

On Windows, I dropped all the databases that I’ve created since I prefer to have a fresh start. Now all I have to do is access the databases on linux remotely from my db2 on Windows.

db2 catalog tcpip node dbonlinux remote 10.8.8.230 server 50000

dbonlinux – an arbitrary name for the node I created.
10.8.8.230 – IP address of the Linux (remote) server.
50000 – the port DB2 listens on on the remote server (it may show up as iiimsf in the Linux services file). This is the default port.

db2 catalog db db1 at node dbonlinux
db2 catalog db db2 at node dbonlinux
db2 catalog db db3 at node dbonlinux
db2 catalog db db4 at node dbonlinux
db2 catalog db db5 at node dbonlinux

db2 terminate
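
Since the catalog commands only differ by database name, they can be generated in a loop. This is a small sketch that only prints the commands so they can be eyeballed first; catalog_cmds is a name I made up, and the node, IP and port are just the values from this post.

```shell
# Print the catalog commands for one remote node and a list of databases.
# catalog_cmds is a hypothetical helper name; nothing is executed here.
catalog_cmds() {
    # $1: node name, $2: remote host, $3: port, remaining args: database names
    node=$1; host=$2; port=$3
    shift 3
    echo "db2 catalog tcpip node $node remote $host server $port"
    for db in "$@"; do
        echo "db2 catalog db $db at node $node"
    done
    echo "db2 terminate"
}

# Example: review the output, then pipe it to sh to actually run it.
#   catalog_cmds dbonlinux 10.8.8.230 50000 db1 db2 db3 db4 db5
```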

Now I can connect to my Linux databases remotely from the Windows server by using these commands:

db2 connect to db1 user db_username using db_password
db2 connect to db2 user db_username using db_password
db2 connect to db3 user db_username using db_password
db2 connect to db4 user db_username using db_password
db2 connect to db5 user db_username using db_password

If you fail to connect, check that you’re using the correct port.

To check which port is in use on the server that you wish to access:

1) db2 get dbm cfg | grep SVCENAME

Most of the time it’ll return the service name instead of the port number, so look up the port for that service name in the services file.
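
The lookup itself can be done in one go. This is a sketch assuming a Linux server where the services file is /etc/services; port_for_service is a name I made up, and db2c_db2inst1 in the example is only a typical-looking service name, so use whatever SVCENAME actually returned.

```shell
# Look up the TCP port for a service name in a services file.
# port_for_service is a hypothetical helper name, not part of DB2.
port_for_service() {
    # $1: service name as reported by SVCENAME
    # $2: services file (optional, defaults to /etc/services)
    awk -v svc="$1" '
        $1 == svc && $2 ~ /\/tcp$/ { split($2, a, "/"); print a[1]; exit }
    ' "${2:-/etc/services}"
}

# Example (the service name here is an assumption):
#   port_for_service db2c_db2inst1
```

If it prints nothing, the service name isn’t in the file, and SVCENAME may already be a plain port number.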

Now that I’m successfully connected, I ran the db2move command again.

db2move db1 export

And I did the same with the other 4 databases. This time when I checked, the LOBs were exported as well. Coolness.

Remember to disconnect from the database that you’ve accessed remotely. You wouldn’t want to mess with the production database. As for me, I won’t be needing to access the remote database again, so I removed the database alias and the node I’ve created.

db2 uncatalog db db1
db2 uncatalog db db2
db2 uncatalog db db3
db2 uncatalog db db4
db2 uncatalog db db5
db2 uncatalog node dbonlinux

Create all 5 databases again with the db2 create db <database_name> command.

I ran the sql script generated by db2look again, loaded the data using the db2move command and that’s it, I’m done.

But, I’m not so lucky. Only 3 out of 5 databases made it across without any errors. To be honest, I was pretty devastated at this point.

Further checking revealed that during the execution of the sql script generated by db2look, the table spaces were not created because of a bad container path. I was completely dumbfounded because the container path was good, seriously. Aargghhhhhhhhhhhhhhhhhh! I decided to proceed without the table spaces and create them manually afterwards.

All this while I’ve been doing db2move in load mode. With db2move <db_name> load, you have to have the tables created on the database first, else you’ll receive tons of errors. With import, you don’t. So, for the databases that I failed to load the data into, I did an import instead. Again, I dropped the databases and recreated them for a clean start.

db2move db1 import
db2move db2 import

Success. Cool.

Now that the tables are all imported, I created the necessary table spaces manually, matching the names listed in the sql script generated by db2look.

Run the sql script generated by db2look.

I’m DONE!

And that’s what I thought. Bleargh.

Well ok, 95% I’m done, with all the exporting and loading, which is the crucial part anyways.

VERIFYING INTEGRITY

The final part is to check the integrity of the migrated database.

When I first ran select * from table_name, I encountered this error:

SQL0668N Operation not allowed for reason code "1" on table blah.db1. SQLSTATE=57016

More info at https://publib.boulder.ibm.com/infocenter/db2luw/v9r5/index.jsp?topic=/com.ibm.db2.luw.messages.sql.doc/doc/msql00668n.html

Run the following command and all’s good:

db2 set integrity for <table_name> immediate checked

To check which tables are in check pending state, run the following command:

db2 "select tabname from syscat.tables where status='C'"

The output is a list of tables that require the execution of the set integrity statement. It’d be lovely to have a script or a single command that can set the integrity on all the affected tables, rather than doing it one by one.
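
Something along these lines might do it. This is a rough sketch, not something I ran during this migration: it turns a list of table names into set integrity statements that can be saved to a file and fed back to db2. gen_set_integrity is a name I made up, and DB2INST1 is just the schema from the warning earlier. Tables tied together by foreign keys may need to go into one statement or be run in dependency order, which this doesn’t handle.

```shell
# Turn table names (one per line on stdin) into SET INTEGRITY statements.
# gen_set_integrity is a hypothetical helper name, not part of DB2.
gen_set_integrity() {
    # $1: schema used to qualify each table name
    while IFS= read -r tab; do
        # Trim the padding db2 puts around column values; skip blank lines.
        tab=$(printf '%s' "$tab" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
        [ -n "$tab" ] || continue
        printf 'SET INTEGRITY FOR %s.%s IMMEDIATE CHECKED;\n' "$1" "$tab"
    done
}

# Example, while connected to the database (-x drops the column headings):
#   db2 -x "select tabname from syscat.tables where status='C' and tabschema='DB2INST1'" \
#       | gen_set_integrity DB2INST1 > set_integrity.sql
#   db2 -tvf set_integrity.sql
```

Have a look at set_integrity.sql before running it, since it will touch every table in the list.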

Yea, I’m DONE :D

Hope I didn’t miss out anything.

Recommended readings:

Using DB2 utilities to clone databases across different platforms

DB2 Version 8 Connectivity Cheat Sheet

DB2 Backup Basics

DB2 Backup Basics – Part 2

DB2 Backup Basics – Part 3

September 22, 2008

Production server – ain’t no playground.

Filed under: Tech — Tags: , , — od @ 11:10 pm

Screwing up a production server is a nightmare, especially when it involves a database. Well actually, it doesn’t matter. As long as it’s a production server, it is a nightmare. It was so scary that I had sleepless nights over the weekend that I’d been drooling over all week.

I am not a database expert, so when my team leader came to me and asked me to migrate the production database from Windows to Linux because the db admin had already resigned, I did fret a bit. So I started doing some quick research on db2, but in the end, one silly mistake I made brought the entire database down. As well as the system that relies on it. As well as myself. I was down.

I thanked myself for doing an offline backup before starting the migration. I’m not gonna nag about why the production server had been running all this while without a single backup, or why no one told me that the server was in use on a daily basis. Yes, it is a production server, but since I was allowed to do the job during office hours, my assumption was that it had been put on hold so that no one would be using it in broad daylight. Else, I would’ve considered doing an online backup.

That was an experience that I will never forget. And this brings me to my next post.
