Automatic Failover for Primary/Backup CNAME Resolution Using the CloudXNS API

Background

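It's getting late, so I'll keep the background brief. In my earlier post on setting up a shadowsocks/pac proxy with load balancing and HA, I mentioned using haproxy for load balancing together with DNS-based scheduling. One problem left open in that post is that CloudXNS officially supports monitoring and intelligent failover only for the primary and backup IPs of a domain's A records; CNAME records are not supported.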

Requirements

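My current setup: for the record vpn.tyr.gift, the default line resolves to a cubieboard in Beijing, while requests on the East China line resolve to an openwrt router in Nanjing. I want the default line to resolve to the cubieboard while it is healthy and to switch automatically to openwrt when the cubieboard goes down (which currently happens a lot). The East China line should behave the same way, with openwrt as its primary.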

So the code in this post does the following:

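  1. Monitor the working state of multiple nodes
  2. If a node is considered down, modify the resolution record to switch to another node
  3. When the node recovers, restore the resolution record to the original node
  4. This is what I mean by "intelligent resolution switching"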
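
The switching rule behind these steps can be written as a small pure function. What follows is only a minimal Python 3 sketch (the listings in this post are Python 2); the table `LINES`, the function `choose_target`, and the node labels `bj` and `nj` are names made up for illustration and do not come from the original script.

```python
# Hypothetical line table: each DNS line has a preferred node and a fallback.
LINES = {
    "default": {"primary": "bj", "backup": "nj"},     # default line -> cubieboard (Beijing)
    "east_china": {"primary": "nj", "backup": "bj"},  # East China line -> openwrt (Nanjing)
}


def choose_target(line, alive):
    """Return the node a line should resolve to, given node health.

    `alive` maps node name -> bool. The primary wins whenever it is up;
    the backup is used only when the primary is down and the backup is up.
    If both are down, stay on the primary (there is nothing better to switch to).
    """
    cfg = LINES[line]
    if alive.get(cfg["primary"], False):
        return cfg["primary"]
    if alive.get(cfg["backup"], False):
        return cfg["backup"]
    return cfg["primary"]
```

Keeping the decision separate from the API calls makes it easy to check the rule for every combination of node states before any DNS record is touched.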

Implementation

CloudXNS API

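CloudXNS provides a RESTful API. For our needs, three API calls are required: get the domain list, get the resolution record list, and add a resolution record. Calling the first two gives us the domain_id of the domain and the record_id of the record we want to change; we then call the last one, passing in the record_id, to modify that record.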

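For the latest API documentation, and for detailed explanations of each call, refer to the official CloudXNS documentation; I won't go into more detail here.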

headers

Every API call must carry the appropriate HTTP headers. They are built as follows:

import hashlib
import time


def get_headers(URL, BODY):
    # Credentials issued by CloudXNS (placeholders here)
    API_KEY = 'xxxxxxxxxxxxxxxxxxx'
    SECRET_KEY = 'xxxxxxxxxxx'
    # Request date, e.g. "Wed, 24 Dec 2014 08:26:21 +0800".
    # The offset is hard-coded, so the local clock must be in UTC+8.
    API_REQUEST_DATE = time.strftime("%a, %d %b %Y %H:%M:%S +0800", time.localtime())
    API_FORMAT = 'json'
    # Signature: MD5 over key + URL + body + date + secret
    API_HMAC = hashlib.md5(API_KEY + URL + BODY + API_REQUEST_DATE + SECRET_KEY).hexdigest()
    headers = {
        'API-KEY': API_KEY,
        'API-REQUEST-DATE': API_REQUEST_DATE,
        'API-HMAC': API_HMAC,
        'API-FORMAT': API_FORMAT,
    }
    return headers
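
The signing scheme used above (an MD5 digest over API key, URL, body, request date, and secret key) can also be sketched for Python 3, where the digest input has to be encoded to bytes first. This is an illustrative variant, not the author's code: `build_headers` and its parameters are hypothetical names, and using `email.utils.formatdate` to produce the date with the machine's real UTC offset is my assumption about what the API accepts.

```python
import hashlib
from email.utils import formatdate


def build_headers(api_key, secret_key, url, body, request_date=None):
    """Build the four request headers; the signature is md5(key + url + body + date + secret)."""
    if request_date is None:
        # RFC 2822 style date with the local UTC offset,
        # e.g. "Wed, 24 Dec 2014 08:26:21 +0800"
        request_date = formatdate(localtime=True)
    raw = api_key + url + body + request_date + secret_key
    digest = hashlib.md5(raw.encode("utf-8")).hexdigest()
    return {
        "API-KEY": api_key,
        "API-REQUEST-DATE": request_date,
        "API-HMAC": digest,
        "API-FORMAT": "json",
    }
```

Passing the credentials and the date as arguments keeps the function free of hidden state, so the same inputs always produce the same signature.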

Getting the domain list

This step obtains the domain_id needed later on.

import urllib2


def get_domain():
    url = 'https://www.cloudxns.net/api2/domain'
    body = ''  # GET request: the signed body is empty
    req = urllib2.Request(url, headers=get_headers(url, body))
    resp = urllib2.urlopen(req)
    return resp.read()

The response looks like this:

{
    "code": 1,
    "message": "success",
    "total": "1",
    "data": [{
        "id": "72348",
        "domain": "tyr.gift.",
        "status": "ok",
        "level": "3",
        "take_over_status": "ok",
        "create_time": "2015-03-15 15:09:29",
        "update_time": "2016-09-22 00:00:00",
        "ttl": "600"
    }]
}

The value we need to note here is the domain_id: 72348.
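
Rather than reading the ID off the response by eye, it can be looked up by domain name. The following is a minimal Python 3 sketch that assumes the response has exactly the shape shown above; `find_domain_id` is a hypothetical helper, not part of the original script.

```python
import json


def find_domain_id(response_text, domain):
    """Return the id of `domain` from a domain-list response, or None if absent."""
    payload = json.loads(response_text)
    if payload.get("code") != 1:
        raise ValueError("API error: %s" % payload.get("message"))
    # The API returns fully qualified names with a trailing dot ("tyr.gift.")
    wanted = domain.rstrip(".") + "."
    for item in payload.get("data", []):
        if item.get("domain") == wanted:
            return item["id"]
    return None
```

With the sample response above, `find_domain_id(text, "tyr.gift")` yields the string "72348".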

Getting the resolution records

What we care about here is the record_id of each record we want to modify.

import urllib2


def get_dns_entry():
    domain_id = '72348'  # from the domain list above
    url = 'https://www.cloudxns.net/api2/record/%s?host_id=0' % domain_id
    body = ''  # GET request: the signed body is empty
    req = urllib2.Request(url, headers=get_headers(url, body))
    resp = urllib2.urlopen(req)
    return resp.read()

Set domain_id to the ID obtained in the previous step for the domain we want to modify.

An excerpt of the response:

{
    "record_id": "182492823",
    "host_id": "23820374",
    "host": "vpn",
    "line_id": "6",
    "line_zh": "\u6559\u80b2\u7f51\u9ed8\u8ba4",
    "line_en": "CER",
    "mx": null,
    "value": "my-value-a@tyr.gift.",
    "ttl": "600",
    "type": "LINK",
    "status": "ok",
    "create_time": "2016-06-11 22:19:43",
    "update_time": "2016-09-21 21:49:32"
},
{
    "record_id": "192831332",
    "host_id": "29384138",
    "host": "vpn",
    "line_id": "1",
    "line_zh": "\u5168\u7f51\u9ed8\u8ba4",
    "line_en": "DEFAULT",
    "mx": null,
    "value": "my-value-b@tyr.gift.",
    "ttl": "600",
    "type": "LINK",
    "status": "ok",
    "create_time": "2016-04-15 15:18:29",
    "update_time": "2016-07-10 00:43:42"
}

We need to pick out the record_id values of the records to modify, here 182492823 and 192831332, which correspond to the East China line and the default line.
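
Picking the IDs out by hand works for two records; with more lines it may be easier to index them by host and line. The sketch below (Python 3) assumes the record objects from the excerpt have already been collected into a list; how the full response wraps them is not shown above, so that part is left out. `index_records` is a hypothetical helper.

```python
def index_records(records, host):
    """Map line_en -> record_id for every record that belongs to `host`."""
    return {r["line_en"]: r["record_id"] for r in records if r.get("host") == host}


# The two records from the excerpt, reduced to the fields used here
records = [
    {"record_id": "182492823", "host": "vpn", "line_en": "CER"},
    {"record_id": "192831332", "host": "vpn", "line_en": "DEFAULT"},
]
vpn_records = index_records(records, "vpn")
```

Here `vpn_records["DEFAULT"]` is "192831332", the record for the default line.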

Modifying a resolution record

The modification code is as follows:

import json

import requests


def alter_resolve(record_id, domain_id, host, value, ttl, type):
    url = 'https://www.cloudxns.net/api2/record/%s' % record_id
    # Note: ttl is accepted as a parameter but is not sent in the body
    body = {
        'domain_id': domain_id,
        'host': host,
        'value': value,
        'type': type,
    }
    body = json.dumps(body)
    # The signature must cover the exact JSON string that is sent
    r = requests.put(url, body, headers=get_headers(url, body))
    return r

If the modification succeeds, a response like the following is returned:

{
    "code":1,
    "message":" success",
    "data":{
        "id":63389,
        "domain_name":"x.1s45test.com.",
        "value":"9.2.4.3"
    }
}

Checking r.status_code together with the message field of the response tells us whether the modification succeeded.
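
That check can be wrapped in one function. The following Python 3 sketch is an assumption about what counts as success (HTTP 200, code equal to 1, message equal to "success"); it is not taken from the original script, and `update_succeeded` is a hypothetical name. The message is stripped because the sample response above carries a leading space.

```python
import json


def update_succeeded(status_code, response_text):
    """Return True only for HTTP 200 with code == 1 and message 'success'."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(response_text)
    except ValueError:
        # Not JSON at all (e.g. an HTML error page)
        return False
    if not isinstance(payload, dict):
        return False
    message = str(payload.get("message", "")).strip().lower()
    return payload.get("code") == 1 and message == "success"
```

With the response object returned by requests, the call would be `update_succeeded(r.status_code, r.text)`.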

Intelligent resolution switching

This part implements two things:

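  1. Use an HTTP GET to check whether the cubieboard and openwrt are working normally
  2. If one is not, modify the resolution record to point to the node that is currently healthy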

Checking liveness

Check whether an HTTP GET succeeds and whether the expected content comes back.

import urllib2


def bj_proxy_alive():
    url = 'http://bj.cubieboard.vpn.tyr.gift:8288'
    try:
        # A timeout keeps a hung node from blocking the monitor loop
        resp = urllib2.urlopen(url, timeout=10)
        resp = resp.read()
        # The node counts as alive only if the expected page text is served
        if "TyrChen's notes" in resp:
            return True
        else:
            return False
    except Exception as e:
        print e
        return False
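
The same probe can be generalized so that one function serves both nodes. This is a Python 3 sketch under my own assumptions: `http_get` and `node_alive` are hypothetical names, and the injectable `fetch` argument exists only so the logic can be exercised without a network.

```python
import urllib.request


def http_get(url, timeout=5.0):
    """Fetch a URL and return the body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def node_alive(url, marker, fetch=http_get):
    """Return True if the page at `url` can be fetched and contains `marker`."""
    try:
        return marker in fetch(url)
    except Exception as exc:
        # Connection refused, timeout, DNS failure, ... all count as "not alive"
        print("probe failed for %s: %s" % (url, exc))
        return False
```

For the Beijing node this would be called as `node_alive('http://bj.cubieboard.vpn.tyr.gift:8288', "TyrChen's notes")`.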

Domain monitoring and resolution switching

This part does the following:

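  1. Periodically check whether the BJ and NJ nodes are working normally, and record any failure
  2. When the number of failed probes reaches a threshold, treat the node as down, then trigger the resolution switch and a log notification
  3. When the node recovers, switch back automatically and send a notification
  4. Apply this monitoring and intelligent resolution switching to both the BJ and NJ nodes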

The main code is as follows.

# sendmsg, nj_proxy_alive and the _X2Y switch helpers are defined in the full script
INTER = 1          # seconds between probe rounds
_BJ_ON2DOWN = ''   # last reported state of the BJ node ('Up' / 'Down')
_NJ_ON2DOWN = ''   # last reported state of the NJ node
retry = 4          # failed probes before a node is treated as down
_of_bj = 0         # failed-probe counter for BJ
_of_nj = 0         # failed-probe counter for NJ
while True:
    ti = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    if bj_proxy_alive():
        # BJ node is healthy
        print '%s bj_proxy_alive: %s' % (ti, 'ok')
        _bj_new = 'Up'
        if _BJ_ON2DOWN != _bj_new:
            # BJ node went down -> up
            sendmsg('%s bj proxy is up!' % ti)
            _BJ_ON2DOWN = _bj_new
            if _DEFAULT2BJ():
                # Default line switched back to the BJ node
                msg = 'Default Line Proxy Switch To BJ Success!'
                sendmsg(msg)
            else:
                msg = 'Default Line Proxy Switch To BJ Fail!'
                sendmsg(msg)
    else:
        # BJ probe failed
        print '%s bj_proxy_alive: %s' % (ti, 'no!')
        _of_bj += 1
    if nj_proxy_alive():
        # NJ node is healthy
        print '%s nj_proxy_alive: %s' % (ti, 'ok')
        _nj_new = 'Up'
        if _NJ_ON2DOWN != _nj_new:
            # NJ node went down -> up
            sendmsg('%s nj proxy is up!' % ti)
            _NJ_ON2DOWN = _nj_new
            if _NJ2NJ():
                # East China line switched back to the NJ node
                msg = 'NJ Proxy Switch To NJ Success!'
                sendmsg(msg)
            else:
                msg = 'NJ Proxy Switch To NJ Fail!'
                sendmsg(msg)
    else:
        # NJ probe failed
        print '%s nj_proxy_alive: %s' % (ti, 'no!')
        _of_nj += 1
    if _of_bj >= retry:
        # BJ node is down
        msg = '%s bj proxy is down!' % ti
        _of_bj = 0
        print msg
        _bj_new = 'Down'
        if _BJ_ON2DOWN != _bj_new:
            # BJ node went up -> down
            sendmsg(msg)
            _BJ_ON2DOWN = _bj_new
            if _DEFAULT2NJ():
                msg = 'Default Line Proxy Switch To NJ Success!'
                sendmsg(msg)
            else:
                msg = 'Default Line Proxy Switch To NJ Fail!'
                sendmsg(msg)
    elif _of_nj >= retry:
        # NJ node is down
        msg = '%s nj proxy is down!' % ti
        _of_nj = 0
        print msg
        _nj_new = 'Down'
        if _NJ_ON2DOWN != _nj_new:
            # NJ node went up -> down
            sendmsg(msg)
            _NJ_ON2DOWN = _nj_new
            if _NJ2BJ():
                msg = 'NJ Proxy Switch To BJ Success!'
                sendmsg(msg)
            else:
                msg = 'NJ Proxy Switch To BJ Fail!'
                sendmsg(msg)
    time.sleep(INTER)

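Here _NJ2BJ and _NJ2NJ switch the East China line's resolution to BJ and NJ respectively, and _DEFAULT2NJ and _DEFAULT2BJ switch the default line's resolution to NJ and BJ.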
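
The per-node bookkeeping in the loop (a failure counter plus the last reported state) can be folded into a small class so that both nodes share one implementation. This Python 3 sketch is a variation rather than a transcription: `NodeState` is a hypothetical name, and unlike the loop above it resets the counter on every successful probe, so only consecutive failures count toward the threshold.

```python
class NodeState:
    """Track one node and report only state transitions."""

    def __init__(self, threshold):
        self.threshold = threshold  # consecutive failures before "down"
        self.failures = 0
        self.status = None          # unknown until the first verdict

    def observe(self, alive):
        """Feed one probe result; return 'up' or 'down' on a transition, else None."""
        if alive:
            self.failures = 0
            if self.status != "up":
                self.status = "up"
                return "up"
            return None
        self.failures += 1
        if self.failures >= self.threshold and self.status != "down":
            self.status = "down"
            return "down"
        return None


# Usage: one tracker per node; act only when observe() reports a transition
bj = NodeState(threshold=3)
events = [bj.observe(ok) for ok in [True, False, False, True, False, False, False, False, True]]
```

In this run `events` contains "up" for the first probe, "down" once three failures in a row have been seen, and "up" again on recovery; every other entry is None, which is where the caller would skip both the notification and the DNS update.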

I tested this by using iptables on the BJ and NJ nodes; both monitoring and resolution switching worked correctly.

log:

2016-09-22 02:27:54 bj proxy is up!
Default Line Proxy Switch To BJ Success!
2016-09-22 02:27:54 nj proxy is up!
NJ Proxy Switch To NJ Success!
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
2016-09-22 02:28:31 bj proxy is down!
Default Line Proxy Switch To NJ Success!
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
2016-09-22 02:28:46 bj proxy is up!
Default Line Proxy Switch To BJ Success!

The complete code is available here.
