sixtydoses. where od is harmless.

May 23, 2009

Edge Load Balancer Network Dispatcher – Double Collocated HA on HP-UX.

Filed under: Tech — od @ 5:01 am

One of my recent projects was to configure the Edge load balancer on 2 servers in a high availability (HA) environment. I rarely do Edge, but the configuration is pretty straightforward. In my past projects, Edge has always been implemented on separate boxes, which is easier compared to a collocated setup. In this post I’m going to share my configuration for the Edge dispatcher (MAC forwarding) that resides together with the web server (I’m using IHS) and WebSphere. Each server will use 1 IP address for both the web server and the dispatcher. The configuration is almost the same, but there were a few issues that I encountered, and I hope this post will be of help to those who are dealing with the Edge dispatcher as well.

For a typical setup where the Edge load balancer servers do not reside in the same box as the web servers, the general rules are:
– Primary Edge – cluster IP aliased to its NIC.
– Standby Edge – cluster IP aliased to its loopback.
– Web Servers – cluster IP aliased to loopback.

The same rules apply in a collocated environment:
– Primary Edge – cluster IP aliased to its NIC.
– Standby Edge – cluster IP aliased to its loopback.

Collocated Edge.

Double collocated HA edge.

Say I have the following:
Cluster IP – 192.168.10.10
Cluster port – 8080
Primary Edge – 192.168.10.20
Backup Edge – 192.168.10.21


default.cfg for Primary Edge:

dscontrol set loglevel 5
dscontrol set logsize 50000000
dscontrol executor start

dscontrol executor set nfa 192.168.10.20

dscontrol highavailability heartbeat add 192.168.10.20 192.168.10.21
dscontrol highavailability backup add primary auto 8880
dscontrol highavailability reach add 192.168.10.55
dscontrol highavailability reach add 192.168.10.56

dscontrol cluster add 192.168.10.10
dscontrol port add 192.168.10.10:8080

dscontrol server add 192.168.10.10:8080:192.168.10.20
dscontrol server add 192.168.10.10:8080:192.168.10.21

dscontrol manager start manager.log 10004
dscontrol man reach set loglevel 5
dscontrol man reach set logsize 50000000
dscontrol advisor start Http 192.168.10.10:8080 Http_192.168.10.10_8080.log



default.cfg for Standby Edge:

dscontrol set loglevel 5
dscontrol set logsize 50000000

dscontrol executor start

dscontrol executor set nfa 192.168.10.21

dscontrol highavailability heartbeat add 192.168.10.21 192.168.10.20
dscontrol highavailability backup add backup auto 8880
dscontrol highavailability reach add 192.168.10.55
dscontrol highavailability reach add 192.168.10.56

dscontrol cluster add 192.168.10.10
dscontrol port add 192.168.10.10:8080

dscontrol server add 192.168.10.10:8080:192.168.10.21
dscontrol server add 192.168.10.10:8080:192.168.10.20

dscontrol manager start manager.log 10004
dscontrol man reach set loglevel 5
dscontrol man reach set logsize 50000000
dscontrol advisor start Http 192.168.10.10:8080 Http_192.168.10.10_8080.log



goActive script:

This script removes the cluster IP from the loopback and aliases it to the NIC.

#!/bin/ksh

CLUSTER=192.168.10.10
LOOPBACK=lo0:1

ifconfig $LOOPBACK 0.0.0.0
dscontrol executor configure $CLUSTER



goStandby script:
This script removes the cluster IP from the NIC and aliases it to the loopback.

#!/bin/ksh

LOOPBACK=lo0:1
CLUSTER=192.168.10.10
NETMASK=255.255.255.192

dscontrol executor unconfigure $CLUSTER
ifconfig $LOOPBACK $CLUSTER netmask $NETMASK up



goInOp script:
This script removes the cluster IP from all devices (loopback and NIC).

#!/bin/ksh

CLUSTER=192.168.10.10
NETMASK=255.255.255.192
LOOPBACK=lo0:1

dscontrol executor unconfigure $CLUSTER
ifconfig $LOOPBACK $CLUSTER netmask $NETMASK down



The usual way to test whether high availability works smoothly is to unplug the network cable from the Edge server. I would tail the root mail (/var/mail/root) at the same time, so I could see which HA script was triggered when the network was interrupted. Another method is to bring the server down, by rebooting it or shutting it down. With a reboot you’ll only have a short time span to monitor the failover in action, but of course this depends on how long your servers take to start up.

But since this is a collocated environment, if I were to opt for either of the described testing methods, I wouldn’t be able to see whether the dispatcher balances all requests to both web servers accordingly (in my case I’m using the round robin algorithm). So what I did was manually stop the executor so that a failover occurs. Note that stopping dsserver alone won’t trigger the HA scripts; actually it is not necessary to stop dsserver at all. To be honest, even when it’s not a collocated environment, I normally test the HA failover by stopping the executor, since I usually work remotely and unplugging the cable requires me to get the help of the sysadmins. So I might as well check that it’s really working before going through all that hassle.
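Roughly, the test I run from the currently active dispatcher looks like this (a sketch; I’m assuming dscontrol and dsserver are on the PATH, and that default.cfg gets replayed when dsserver is restarted):

dscontrol highavailability status     # confirm this node's current role
dscontrol executor stop               # this should trigger goActive on the partner
tail -f /var/mail/root                # watch which HA script gets fired

# once done, bring this node back up as a dispatcher:
dsserver stop
dsserver start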

One of the problems that I encountered was instability. Sometimes the dispatchers would run in the right mode (active | standby), but most of the time both would run as active. It was very unstable, with no certain pattern that I could track. Even worse, sometimes when I tried to run the dispatcher as a standalone lb, all of the incoming requests would be routed directly to the web server, skipping the dispatcher completely. I was stuck with this problem for several days before I finally figured out what the culprit was.

The ibmlb module.

Every time the executor is stopped, the ibmlb module is unloaded. Every time the executor starts, the ibmlb module is loaded into the kernel. I’m lucky that I have dmesg on both servers, so based on dmesg, this is how it should look whenever you stop and start the executor:

ibmlb DLKM successfully unloaded
ibmlb DLKM successfully loaded

But what happened was, when I stopped the executor, ibmlb was not unloaded. The status was busy, and I had to unload the module explicitly.

ibmlb DLKM successfully unloaded
ibmlb DLKM successfully loaded
ibmlb version is 06.01.00.00 – 20060515-232359 [wsbld265]
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb

I’ve not seen anything like this before (I used to configure the dispatcher on AIX servers). Consider the following test cases (the ARP table was checked from a different server that resides on the same segment):

TEST 1.

1) Primary active, Backup standby. Cluster IP belongs to Primary.
2) Primary down, Backup goes active. Module ibmlb is UNLOADED successfully on Primary. Cluster IP belongs to Backup.
3) Primary up in active mode, Backup goes standby. Cluster IP belongs to Primary.

TEST 2.
1) Primary active, Backup standby. Cluster IP belongs to Primary.
2) Primary down, Backup goes active. Module ibmlb is busy and still LOADED on Primary. Cluster IP belongs to Backup.
3) Primary up in active mode, Backup stays active. Cluster IP belongs to Primary, but all requests will skip dispatcher and go straight to the web server.


TEST 3.

1) Primary active, Backup standby. Cluster IP belongs to Primary.
2) Primary down, Backup goes active. Module ibmlb is UNLOADED successfully on Primary. Cluster IP belongs to Backup.
3) Primary up in active mode, Backup goes standby. Cluster IP belongs to Primary.
4) Backup down. Module ibmlb is UNLOADED successfully.
5) Backup up, running in standby mode.
6) Backup down. Module ibmlb is busy and still LOADED on backup.
7) Backup up, running in active mode (remember that Primary is also in active mode too). Cluster IP belongs to Backup, but all requests will skip the dispatcher and go straight to the web server.
8) Backup down. Module ibmlb is busy and still LOADED on Backup. Explicitly unload the module using the kcmodule command until it gets UNLOADED. Cluster IP belongs to Primary.
9) Backup up, running in standby mode.

Most of the time I wasn’t able to unload it right away; I had to let the server ‘rest’ for about 15 – 20 minutes before trying to unload it again. Rebooting the server always solves this problem (the module’s next state becomes unused). I’m not sure if there’s a way to force a module to be unloaded though. As far as I know there’s no force flag for kcmodule.
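For reference, this is roughly how I check and unload the module by hand on HP-UX (the output and module id will differ on your box):

kcmodule ibmlb                # show the module's current and next state
kcmodule ibmlb=unused         # ask the kernel to unload it
dmesg | tail                  # look for 'ibmlb DLKM successfully unloaded'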

I was fooled several times when I tested the splash page of the web servers from my Opera browser. I was on a different subnet, so I guess there must have been a switch/router between me and the Edge servers. At times, even when the cluster IP was aliased to the Primary Edge, my browser would point to the Backup Edge since the ARP cache was not refreshed. It was so annoying since this affects the cluster report. The rest of the testing was done by running a browser from a different server on the same subnet. At least I could clear the ARP cache manually if I had to.
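Clearing the stale entry for the cluster IP goes something like this, run as root on the test box (flags can vary slightly between platforms):

arp -a | grep 192.168.10.10     # see which MAC the cluster IP currently maps to
arp -d 192.168.10.10            # delete the stale entry, then hit the splash page again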

Okay, this is probably a browser problem on my side, but testing the splash page with Firefox sucks. It kept hitting the splash page even after I had stopped both web servers and cleared the cache. It was alright with Opera though. What gives?

By the way, I’m using Edge v6.1. If you check out the Edge Fixpack page here, you’ll notice that there is no patch for HP-UX. Not a single patch. Is IBM trying to say something? Don’t use Edge on HP-UX, perhaps? Lol. Anyway, IBM packaged a patch for me (6.1.0.35), but it still didn’t address the module issue. I’m not sure if I could call it a patch though; it’s more like an installer, since I had to reinstall everything.

Thanks to Robert Brown from IBM for assisting me on this ‘false alarm’ panic attack (initially I thought it was a network issue).

September 23, 2008

DB2 migration across different platforms.

Filed under: Tech — od @ 2:36 am

So, how do you migrate a database from one server to another across different platforms? There are a few articles related to this topic on the net, and some of them are very good, but I still stumbled on some problems during the migration process. This is not an expert howto article; do note that I am next to clueless when it comes to databases. But yea, this is a howto article so that I remember how I did it the last time, and hopefully it’ll be of help to anyone who comes across this post.

Basically to migrate a db between 2 servers running on different platforms, you’ll need these 2 awesome utility commands:

DB2MOVE

Use db2move to export all tables and data in PC/IXF format.

The db2move command:
db2move <database-name> <action> [<option> <value>]

DB2LOOK

Use the db2look command to extract the DDL statements. What are DDL statements? DDL statements are used to build and modify the structure of your tables and other objects in the database.

The db2look command:

db2look -d <database-name> [<-option1> <-option2> … <-optionx>]

STEP-BY-STEP example – based on a real scenario, with the problems encountered and successfully solved.

Scenario:

Migrating DB2 v8.2 ESE on Linux CentOS 5 to DB2 v9.5 on Windows 2003.

BEFORE MIGRATION STARTS

I have 5 databases located on Linux running DB2 ESE version 8.2. This is my first time doing DB2, my first time seeing the databases, so I did the following before I started migrating:

1) Do a full backup for all databases. This is so very important that I think it’s worth mentioning it another 3 times. In caps. – DO A FULL BACKUP FOR ALL DATABASES. DO A FULL BACKUP FOR ALL DATABASES. DO A FULL BACKUP FOR ALL DATABASES.
2) Record the number of tables listed in each database.
3) Do a full backup for all databases. (Rough command sketches for the backup and the table count follow below.)
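For reference, something along these lines does the job; the backup path is just an example, and the table count from list tables is rough since the output includes header lines:

db2 backup db db1 to /backup          # offline backup; repeat for db2 .. db5
db2 connect to db1
db2 list tables for all | wc -l       # rough table count to compare against after migration
db2 connect reset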

MIGRATE

On Linux:

Export the data with the db2move command (no database connection needed). Run the command in a separate directory for each database, as it will create a number of IXF files, depending on how huge your database is.

db2move db1 export
db2move db2 export
db2move db3 export
db2move db4 export
db2move db5 export

Generate the DDL statements with the db2look command (no database connection needed).

db2look -d db1 -e -a -td @ -l -o db1.sql
db2look -d db2 -e -a -td @ -l -o db2.sql
db2look -d db3 -e -a -td @ -l -o db3.sql
db2look -d db4 -e -a -td @ -l -o db4.sql
db2look -d db5 -e -a -td @ -l -o db5.sql

I didn’t want to use the default delimiter, the semicolon (;), because I’m not sure if there are any stored procedures or functions (I’m not even sure what those are) in the databases. So just to be on the safe side, I used ‘@’ as the termination character instead.

So far, so good.

FTP the files over to the Windows server.

All of the *.ixf files – transfer them in binary mode.
db2move.lst – transfer it in ASCII mode.
*.sql files (generated by the db2look command) – transfer them in ASCII mode.

On Windows:

I already have DB2 ESE version 9.5 installed, with the DAS user and instance created (I prefer the names to match the DB running on Linux).

Create all the databases that I want to import in.

db2 create db db1
db2 create db db2
db2 create db db3
db2 create db db4
db2 create db db5

Run the script generated by db2look (no database connection needed).

db2 -td@ -vf db1.sql
db2 -td@ -vf db2.sql
db2 -td@ -vf db3.sql
db2 -td@ -vf db4.sql
db2 -td@ -vf db5.sql

Notice that I specified the -l option when running the db2look command, which means it will generate the DDL statements for user-defined table spaces, database partition groups and buffer pools. Check the sql scripts and change the container paths to match the Windows environment before executing them. Something like:

'/home/db2inst1/db2inst1/blah/path3/db2inst1_data.tbs' 30000 to 'C:\db2inst1\blah\path3\db2inst1_data.tbs' 30000

Else, you’ll get a ‘Bad container path’ error.

I prefer to pipe the output to a file so that I can review it later. Most of the time I wasn’t able to monitor the output since some of the databases are pretty huge and I was working remotely with a lousy, lousy network connection (I love rdesktop for this).
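Something like this, with an arbitrary file name:

db2 -td@ -vf db1.sql > db1_ddl.out 2>&1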

By this time, my databases contain all the tables that the original databases on Linux do. But of course, they’re all empty.

Normally, there shouldn’t be any problems until you come to the data loading part (no database connection needed).

db2move db1 load
db2move db2 load
db2move db3 load
db2move db4 load
db2move db5 load

The db2move utility will also create an output file named after the action that you specified (in my case, LOAD.out), so I don’t have to bother piping the output to a file.

If this part ends successfully, you’re all done. Unfortunately for me, there were warnings inside the LOAD.out files. I have 5 LOAD.out files altogether, and 4 of them contained the same warning code:

* LOAD: table "DB2INST1"."RQVIEWS"
*** WARNING 3107. Check message file tab52.msg!
*** SQL Warning! SQLCODE is 3107
*** SQL3107W There is at least one warning message in the message file.

So what’s in tab52.msg?

SQL3229W The field value in row “1” and column “9” is invalid. The row was
rejected. Reason code: “1”.

SQL3185W The previous error occurred while processing data from row “1” of
the input file.

SQL3229W The field value in row “2” and column “9” is invalid. The row was
rejected. Reason code: “1”.

SQL3185W The previous error occurred while processing data from row “2” of
the input file.

SQL3229W The field value in row “3” and column “9” is invalid. The row was
rejected. Reason code: “1”.

SQL3185W The previous error occurred while processing data from row “3” of
the input file.

SQL3229W The field value in row “4” and column “9” is invalid. The row was
rejected. Reason code: “1”.

A data type mismatch? To be frank, I don’t know, but as I reviewed the db2move options, there was one that I had probably missed.

-l lobpaths

LOB stands for Large OBject. A large object (LOB) is a string data type with a size ranging from 0 bytes to 2 GB (GB equals 1 073 741 824 bytes).

So, if you know where your LOBs are, specify this option while exporting the data, and make sure to check that you end up with files named similar to the one below when you’re done (a sketch of the export command follows it).

tab52a.001.lob
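In other words, the export should have looked something like this (the LOB directory is just an example path):

cd /migration/db1
db2move db1 export -l /migration/db1/lobs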

Being a complete noob in the world of db, I didn’t know where the LOBs were. In fact, I didn’t even know what it meant the first time I encountered it (no wonder I purposely ignored the -l option in the first place, lol). So, I decided to export the db on Linux once again and dump it straight to Windows, on the fly. This way, even without specifying the -l option, it will export your LOBs as well. Nice.

On Windows, I dropped all the databases that I’d created since I preferred to have a fresh start. Now all I have to do is access the databases on Linux remotely from my DB2 on Windows.

db2 catalog tcpip node dbonlinux remote 10.8.8.230 server 50000

dbonlinux – an arbitrary name for the node I created.
10.8.8.230 – IP address of the Linux (remote) server.
50000 – the instance’s connection port (it shows up as iiimsf in the services file on my box). This is the default port.

db2 catalog db db1 at node dbonlinux
db2 catalog db db2 at node dbonlinux
db2 catalog db db3 at node dbonlinux
db2 catalog db db4 at node dbonlinux
db2 catalog db db5 at node dbonlinux

db2 terminate

Now I can connect to my Linux databases remotely from the Windows server using these commands:

db2 connect to db1 user db_username using db_password
db2 connect to db2 user db_username using db_password
db2 connect to db3 user db_username using db_password
db2 connect to db4 user db_username using db_password
db2 connect to db5 user db_username using db_password

If you fail to connect, check whether you’re using the correct port.

To check which port is used on the server that you wish to connect to:

1) db2 get dbm cfg | grep SVCENAME

Most of the time it’ll return the service name instead of the port number, so look up the port number for that service name in the services file.
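Roughly like this; db2c_db2inst1 is just an example service name, use whatever SVCENAME returns:

db2 get dbm cfg | grep SVCENAME
grep db2c_db2inst1 /etc/services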

Now that I’m successfully connected, I ran the db2move command again.

db2move db1 export

And I did the same with the remaining 4 databases. This time when I checked, the LOBs were exported as well. Coolness.

Remember to disconnect from the database that you’ve accessed remotely. You wouldn’t want to mess with the production database. As for me, I won’t need to access the remote databases again, so I removed the database aliases and the node I’d created.
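Disconnecting is just this, before running the uncatalog commands below:

db2 connect reset      # drop the current connection
db2 terminate          # end the CLP back-end process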

db2 uncatalog db db1
db2 uncatalog db db2
db2 uncatalog db db3
db2 uncatalog db db4
db2 uncatalog db db5
db2 uncatalog node dbonlinux

Create all 5 databases again with the db2 create db <database_name> command.

I ran the sql scripts generated by db2look again, loaded the data using the db2move command, and that’s it, I’m done.

But I’m not so lucky. Only 3 out of the 5 databases made it through without any errors. To be honest, I was pretty devastated at this point.

Further checking revealed that during the execution of the sql script generated by db2look, the table spaces were not created because of a bad container path. I was completely dumbfounded because the container path was good, seriously. Aargghhhhhhhhhhhhhhhhhh! I decided to proceed without the table spaces and create them manually afterwards.

All this while I’d been running db2move in load mode. With db2move <db_name> load, the tables have to already exist in the database; otherwise, you’ll receive tons of errors. With import, you don’t need them. So, for the databases that I failed to load the data into, I used import instead. Again, I dropped the databases and recreated them for a clean start.

db2move db1 import
db2move db2 import

Success. Cool.

Now that the tables are all imported, I created the necessary table spaces manually, matching the names listed in the sql script generated by db2look.
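For the record, creating a DMS table space by hand looks roughly like this; the table space name, container path and size below are made up, the real ones I copied from the db2look script:

db2 connect to db1
db2 "CREATE TABLESPACE DB2INST1_DATA MANAGED BY DATABASE USING (FILE 'C:\db2inst1\db2inst1_data.tbs' 30000)"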

Run the sql script generated by db2look.

I’m DONE!

And that’s what I thought. Bleargh.

Well ok, 95% I’m done, with all the exporting and loading, which is the crucial part anyways.

VERIFYING INTEGRITY

The final part is to check the integrity of the migrated database.

When I first ran a select * from table_name, I encountered this error:

SQL0668N Operation not allowed for reason code “1” on table blah.db1. SQLSTATE=57016

More info at https://publib.boulder.ibm.com/infocenter/db2luw/v9r5/index.jsp?topic=/com.ibm.db2.luw.messages.sql.doc/doc/msql00668n.html

Run the following command and all’s good:

db2 set integrity for <table_name> immediate checked

To check which tables are in a check-pending state, run the following command:

db2 "select tabname from syscat.tables where status = 'C'"

The output is a list of tables that require the execution of the set integrity statement. It’d be lovely to have a script or a single command that could set the integrity on all of the affected tables, rather than doing it one by one for each table.
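One way (a sketch I haven’t polished) is to let the CLP generate the statements and then execute the generated file; tables that depend on each other may need a second pass:

db2 connect to db1
db2 -x "select 'set integrity for ' || rtrim(tabschema) || '.' || rtrim(tabname) || ' immediate checked;' from syscat.tables where status = 'C'" > set_integrity.sql
db2 -tvf set_integrity.sql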

Yea, I’m DONE 😀

Hope I didn’t miss out anything.
Recommended readings:

Using DB2 utilities to clone databases across different platforms

DB2 Version 8 Connectivity Cheat Sheet

DB2 Backup Basics

DB2 Backup Basics – Part 2

DB2 Backup Basics – Part 3

September 22, 2008

Production server – ain’t no playground.

Filed under: Tech — od @ 11:10 pm

Screwing up a production server is a nightmare, especially when it involves a database. Well actually, it doesn’t matter; as long as it’s a production server, it is a nightmare. It’s just too scary that I had sleepless nights during the weekend and ended up drooling through the weekdays.

I am not a database expert, so when my team leader came to me and asked me to migrate the production database from Windows to Linux because the db admin had already resigned, I did fret a bit. So I started doing some quick research on DB2, but in the end, one silly mistake I made brought the entire database down. As well as the system that relies on it. As well as myself. I was down.

I thanked myself for doing an offline backup before starting with all the migration. I’m not gonna nag about why, all this while, the production server had been running without a single backup, or why no one told me that the server was in use on a daily basis. Yes, it is a production server, but since I was allowed to do the job during office hours, my assumption was that it had been put on hold so that no one would be using it in broad daylight. Otherwise, I would’ve considered doing an online backup.

That was an experience that I will never forget. And this brings me to my next post.

March 20, 2008

File name truncated – huh?

Filed under: Tech — od @ 5:58 pm

I was installing WAS v6.0.2 on an HP-UX B.11.31 IA-64 server and had a problem while trying to install the Update Installer for fixpack 25. Again, the problem did not occur during the patching itself, but during the installation of the Update Installer.

Since WAS v6.0.2.21, IBM has separated the update installer package from the fix pack. This is great since it saves time by avoiding the redundancy of downloading the update installer, which can normally be reused across similar product versions. But now one must be sure that he/she has the correct version of the update installer to match the fixpack to be installed – IBM takes care of this with ‘FIX CENTRAL’. As for me, I’m sure I have the correct one.

Now back to the problem with the update installer. Based on the logs, there was a missing file – no such file or directory. Well actually there was more than just one missing file, but the installation would stop each time it failed to find a particular file, and on the next re-run it would detect another missing file, and so on. When I went through the directories, I noticed that there were a few files with their names truncated. So that explains the missing-file errors.

I untarred the same file on my Ubuntu box, checked the file names, and they were all in perfect condition. Erms. I’m confused. Why and how did the names get truncated? Probably during the sftp of the installer from my lappie to the server?

A screenshot that says it all:

WAS err.

On a different note, I think it would be nice if IBM could provide checksums for all the files available for download so that I could simply check whether the files I downloaded are corrupted. At the moment I’m keeping my own list of md5 checksums for all the installers that I have downloaded.
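Nothing fancy, just something like this on the machine where the installers live (md5sum is the GNU coreutils tool; the file names are whatever you downloaded):

md5sum *.tar.gz >> installer_checksums.md5
md5sum -c installer_checksums.md5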

Another thing that bothers me is that sometimes I just don’t understand the Download Director that I use to download IBM software. While I like it more than plain http download, I am so confused by the ETA. How do you get a negative ETA? This normally happens when I download multiple files at a time. Gah.

Download director.

March 14, 2008

Install WAS Base/ND v6.1.0 on Ubuntu Gutsy.

Filed under: Tech — od @ 2:07 pm

There are 2 things that need to be configured in order to install WebSphere Application Server Base/ND on Ubuntu Gutsy successfully – tested using WAS Base/ND v6.1.

1 – Ubuntu Gutsy links sh to dash instead of bash. There won’t be any error during the installation of WAS itself, but you will not be able to create any profile, so it’s useless. There are two ways to fix this: either remove the symlink and relink it to bash (see the sketch below), or change the shebang line inside the WAS install script from #!/bin/sh to #!/bin/bash. Changing the default shell from dash to bash may make your system slower since dash is lighter than bash, but I think it is hardly noticeable. More info at https://wiki.ubuntu.com/DashAsBinSh.
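The relink option is just the following (and you can point it back to dash once the installation is done):

ls -l /bin/sh               # on Gutsy this points to dash
sudo ln -sf bash /bin/sh    # relink sh to bash
# revert later with: sudo ln -sf dash /bin/sh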

2 – This applies to WAS ND; I didn’t encounter any issues with Base. If you’re having a problem getting the dmgr server up, and the error in SystemOut.log is something like this:

[3/12/08 15:38:06:539 MYT] 0000000a LogAdapter E DCSV9403E: Received an illegal configuration argument. Parameter
MulticastInterface, value: 127.0.1.1. Exception is java.lang.Exception: Network Interface 127.0.1.1 was not found in
local machine network interface list. Make sure that the NetworkInterface property is properly configured!
at com.ibm.rmm.mtl.transmitter.Config.<init>(Config.java:238)
at com.ibm.rmm.mtl.transmitter.MTransmitter.<init>(MTransmitter.java:192)
at com.ibm.rmm.mtl.transmitter.MTransmitter.getInstance(MTransmitter.java:406)
at com.ibm.rmm.mtl.transmitter.MTransmitter.getInstance(MTransmitter.java:345)
at com.ibm.htmt.rmm.RMM.getInstance(RMM.java:128)
at com.ibm.htmt.rmm.RMM.getInstance(RMM.java:189)
at com.ibm.ws.dcs.vri.transportAdapter.rmmImpl.rmmAdapter.RmmAdapter.<init>(RmmAdapter.java:218)
at com.ibm.ws.dcs.vri.transportAdapter.rmmImpl.rmmAdapter.MbuRmmAdapter.<init>(MbuRmmAdapter.java:76)
at com.ibm.ws.dcs.vri.transportAdapter.rmmImpl.rmmAdapter.RmmAdapter.getInstance(RmmAdapter.java:133)
at com.ibm.ws.dcs.vri.transportAdapter.TransportAdapter.getInstance(TransportAdapter.java:161)
at com.ibm.ws.dcs.vri.common.impl.DCSCoreStackImpl.<init>(DCSCoreStackImpl.java:178)
at com.ibm.ws.dcs.vri.common.impl.DCSCoreStackImpl.getInstance(DCSCoreStackImpl.java:167)
at com.ibm.ws.dcs.vri.common.impl.DCSStackFactory.getCoreStack(DCSStackFactory.java:92)
at com.ibm.ws.dcs.vri.DCSImpl.getCoreStack(DCSImpl.java:84)
at com.ibm.ws.hamanager.coordinator.impl.DCSPluginImpl.<init>(DCSPluginImpl.java:238)
at com.ibm.ws.hamanager.coordinator.impl.CoordinatorImpl.<init>(CoordinatorImpl.java:322)
at com.ibm.ws.hamanager.coordinator.corestack.CoreStackFactoryImpl.createDefaultCoreStack(CoreStackFactoryImpl
.java:82)

Chances are you have not assigned an IP address to your hostname, apart from the default 127.* address. If this is the case, you won’t be able to federate nodes to the Dmgr either. So edit your hosts file. Since Edgy, the hostname has been split out onto 127.0.1.1, so you will see 127.0.0.1 assigned to localhost and 127.0.1.1 to your hostname. Assign your hostname to 127.0.0.1 as well, and the problem is solved. But if you plan to do node federation, then assign a real IP to your hostname (an example follows the snippet below). Your hosts file should look something like this:

127.0.0.1 localhost YourHostName
127.0.1.1 YourHostName
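And if you plan to federate nodes, map the hostname to the machine’s real address instead; 192.168.1.50 below is just a made-up example:

127.0.0.1 localhost
192.168.1.50 YourHostName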

Done.