Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | ================= |
2 | Troubleshooting | |
3 | ================= | |
4 | ||
5 | ||
6 | The Gateway Won't Start | |
7 | ======================= | |
8 | ||
9 | If you cannot start the gateway (i.e., there is no existing ``pid``), | |
10 | check to see if there is an existing ``.asok`` file from another | |
11 | user. If an ``.asok`` file from another user exists and there is no | |
12 | running ``pid``, remove the ``.asok`` file and try to start the | |
13 | process again. | |
14 | ||
15 | This may occur when you start the process as the ``root`` user while | |
16 | the startup script tries to start it as the ``www-data`` or | |
17 | ``apache`` user. An existing ``.asok`` file owned by another user | |
18 | then prevents the script from starting the daemon. | |
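
The check described above can be sketched in Python. This is a minimal sketch, not part of the init script: the function names, the pid file argument, and the glob pattern are assumptions made for illustration. Other Ceph daemons may keep their own ``.asok`` files in the same directory, so narrow the pattern to the gateway's socket before removing anything.

```python
import os
from pathlib import Path


def pid_is_running(pid):
    """Return True if a process with this pid currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 sends nothing; it only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stale_asok_files(run_dir, pid_file, pattern="*.asok"):
    """List matching socket files in run_dir when pid_file names no live process."""
    pid_path = Path(pid_file)
    if pid_path.is_file():
        text = pid_path.read_text().strip()
        if text.isdigit() and pid_is_running(int(text)):
            return []  # the daemon is running, so its socket is not stale
    return sorted(Path(run_dir).glob(pattern))
```

Each returned path is a candidate for removal. Checking ``path.stat().st_uid`` on a candidate shows which user created it.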
19 | ||
20 | The radosgw init script (``/etc/init.d/radosgw``) also has a verbose argument that | |
21 | can provide some insight into what the issue could be:: | |
22 | ||
23 | /etc/init.d/radosgw start -v | |
24 | ||
25 | or :: | |
26 | ||
27 | /etc/init.d/radosgw start --verbose | |
28 | ||
29 | HTTP Request Errors | |
30 | =================== | |
31 | ||
32 | Examining the web server's own access and error logs is typically | |
33 | the first step in identifying what is going on. A 500 error | |
34 | usually indicates a problem communicating with the | |
35 | ``radosgw`` daemon. Ensure that the daemon is running, that its socket path is | |
36 | configured, and that the web server is looking for it in the proper | |
37 | location. | |
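
One way to test the socket path is to try connecting to it. The sketch below assumes the web server reaches ``radosgw`` through a Unix domain socket. The function name is hypothetical, and a successful connection only shows that something is listening at that path, not that it is a healthy gateway.

```python
import socket


def unix_socket_accepts(path, timeout=2.0):
    """Return True if a process is accepting connections on the Unix socket at path."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except OSError:
        # Covers a missing file, a refused connection, a permission error,
        # and a timeout.
        return False
    finally:
        sock.close()
    return True
```

Run it as the same user as the web server, with the socket path taken from the web server's configuration. A ``False`` result for that path and a ``True`` result for another path would suggest that the two configurations disagree.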
38 | ||
39 | ||
40 | Crashed ``radosgw`` process | |
41 | =========================== | |
42 | ||
43 | If the ``radosgw`` process dies, you will normally see a 500 error | |
44 | from the web server (Apache, nginx, etc.). In that situation, simply | |
45 | restarting ``radosgw`` will restore service. | |
46 | ||
47 | To diagnose the cause of the crash, check the log in ``/var/log/ceph`` | |
48 | and/or the core file (if one was generated). | |
49 | ||
50 | ||
51 | Blocked ``radosgw`` Requests | |
52 | ============================ | |
53 | ||
54 | If some (or all) radosgw requests appear to be blocked, you can get | |
55 | some insight into the internal state of the ``radosgw`` daemon via | |
56 | its admin socket. By default, there will be a socket configured to | |
57 | reside in ``/var/run/ceph``, and the daemon can be queried with:: | |
58 | ||
59 | ceph daemon /var/run/ceph/client.rgw help | |
60 | ||
61 | help list available commands | |
62 | objecter_requests show in-progress osd requests | |
63 | perfcounters_dump dump perfcounters value | |
64 | perfcounters_schema dump perfcounters schema | |
65 | version get protocol version | |
66 | ||
67 | Of particular interest:: | |
68 | ||
69 | ceph daemon /var/run/ceph/client.rgw objecter_requests | |
70 | ... | |
71 | ||
72 | will dump information about requests currently in progress with the | |
73 | RADOS cluster. This lets you identify whether any requests are blocked | |
74 | by a non-responsive OSD. For example, you might see:: | |
75 | ||
76 | { "ops": [ | |
77 | { "tid": 1858, | |
78 | "pg": "2.d2041a48", | |
79 | "osd": 1, | |
80 | "last_sent": "2012-03-08 14:56:37.949872", | |
81 | "attempts": 1, | |
82 | "object_id": "fatty_25647_object1857", | |
83 | "object_locator": "@2", | |
84 | "snapid": "head", | |
85 | "snap_context": "0=[]", | |
86 | "mtime": "2012-03-08 14:56:37.949813", | |
87 | "osd_ops": [ | |
88 | "write 0~4096"]}, | |
89 | { "tid": 1873, | |
90 | "pg": "2.695e9f8e", | |
91 | "osd": 1, | |
92 | "last_sent": "2012-03-08 14:56:37.970615", | |
93 | "attempts": 1, | |
94 | "object_id": "fatty_25647_object1872", | |
95 | "object_locator": "@2", | |
96 | "snapid": "head", | |
97 | "snap_context": "0=[]", | |
98 | "mtime": "2012-03-08 14:56:37.970555", | |
99 | "osd_ops": [ | |
100 | "write 0~4096"]}], | |
101 | "linger_ops": [], | |
102 | "pool_ops": [], | |
103 | "pool_stat_ops": [], | |
104 | "statfs_ops": []} | |
105 | ||
106 | In this dump, two requests are in progress. The ``last_sent`` field is | |
107 | the time the RADOS request was sent. If that time is well in the past, it suggests | |
108 | that the OSD is not responding. For example, for request 1858, you could | |
109 | check the OSD status with:: | |
110 | ||
111 | ceph pg map 2.d2041a48 | |
112 | ||
113 | osdmap e9 pg 2.d2041a48 (2.0) -> up [1,0] acting [1,0] | |
114 | ||
115 | This tells us to look at ``osd.1``, the primary copy for this PG:: | |
116 | ||
117 | ceph daemon osd.1 ops | |
118 | { "num_ops": 651, | |
119 | "ops": [ | |
120 | { "description": "osd_op(client.4124.0:1858 fatty_25647_object1857 [write 0~4096] 2.d2041a48)", | |
121 | "received_at": "1331247573.344650", | |
122 | "age": "25.606449", | |
123 | "flag_point": "waiting for sub ops", | |
124 | "client_info": { "client": "client.4124", | |
125 | "tid": 1858}}, | |
126 | ... | |
127 | ||
128 | The ``flag_point`` field indicates that the OSD is currently waiting | |
129 | for replicas to respond, in this case ``osd.0``. | |
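
When the dump holds many requests, the comparison of ``last_sent`` against the current time can be scripted. The following is a minimal sketch, assuming the ``objecter_requests`` output has been saved as text and that ``last_sent`` uses the timestamp format shown in the example dump. Other Ceph versions may format it differently. The function name and the 30-second default threshold are assumptions, not Ceph defaults.

```python
import json
from datetime import datetime, timedelta


def stale_ops(dump_text, now, max_age=timedelta(seconds=30)):
    """Return (tid, pg, osd, age) for each op sent more than max_age before now."""
    dump = json.loads(dump_text)
    stale = []
    for op in dump.get("ops", []):
        # Assumed format: "2012-03-08 14:56:37.949872", with no timezone.
        sent = datetime.strptime(op["last_sent"], "%Y-%m-%d %H:%M:%S.%f")
        age = now - sent
        if age > max_age:
            stale.append((op["tid"], op["pg"], op["osd"], age))
    return stale


# Usage with one op taken from the example dump and a fixed "now".
sample = (
    '{"ops": [{"tid": 1858, "pg": "2.d2041a48", "osd": 1,'
    ' "last_sent": "2012-03-08 14:56:37.949872"}]}'
)
now = datetime(2012, 3, 8, 14, 57, 3)
for tid, pg, osd, age in stale_ops(sample, now, timedelta(seconds=10)):
    print(f"tid {tid} on osd.{osd} (pg {pg}) was sent {age.total_seconds():.0f}s ago")
```

The ``pg`` value of each reported request can then be passed to ``ceph pg map`` as shown above. In real use, ``now`` must be taken in the same timezone that the daemon uses for its timestamps.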
130 | ||
131 | ||
132 | Java S3 API Troubleshooting | |
133 | =========================== | |
134 | ||
135 | ||
136 | Peer Not Authenticated | |
137 | ---------------------- | |
138 | ||
139 | You may receive an error that looks like this:: | |
140 | ||
141 | [java] INFO: Unable to execute HTTP request: peer not authenticated | |
142 | ||
143 | The Java SDK for S3 requires a valid certificate from a recognized certificate | |
144 | authority, because it uses HTTPS by default. If you are just testing the Ceph | |
145 | Object Storage services, you can resolve this problem in a few ways: | |
146 | ||
147 | #. Prepend the IP address or hostname with ``http://``. For example, change this:: | |
148 | ||
149 | conn.setEndpoint("myserver"); | |
150 | ||
151 | To:: | |
152 | ||
153 | conn.setEndpoint("http://myserver"); | |
154 | ||
155 | #. After setting your credentials, add a client configuration and set the | |
156 | protocol to ``Protocol.HTTP``. :: | |
157 | ||
158 | AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey); | |
159 | ||
160 | ClientConfiguration clientConfig = new ClientConfiguration(); | |
161 | clientConfig.setProtocol(Protocol.HTTP); | |
162 | ||
163 | AmazonS3 conn = new AmazonS3Client(credentials, clientConfig); | |
164 | ||
165 | ||
166 | ||
167 | 405 MethodNotAllowed | |
168 | -------------------- | |
169 | ||
170 | If you receive a 405 error, check to see if you have the S3 subdomain set up correctly. | |
171 | You will need to have a wildcard setting in your DNS record for subdomain functionality | |
172 | to work properly. | |
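
A quick way to test the wildcard record is to resolve a subdomain that nobody has registered explicitly. The sketch below does this in Python. The function name and the ``probe-`` prefix are hypothetical, and the ``resolve`` parameter exists only so that the lookup can be replaced in tests.

```python
import socket
import uuid


def wildcard_dns_resolves(base_host, resolve=socket.gethostbyname):
    """Return True if a random subdomain of base_host resolves to an address."""
    # A random label is very unlikely to have its own DNS record, so a
    # successful lookup suggests that a wildcard record answered it.
    probe = f"probe-{uuid.uuid4().hex[:12]}.{base_host}"
    try:
        resolve(probe)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True
```

Call it with the gateway's hostname, for example ``wildcard_dns_resolves("gateway.example.com")``, where the hostname is a placeholder. A ``False`` result suggests that bucket subdomains will not resolve either.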
173 | ||
174 | Also, check to ensure that the default site is disabled. The error looks like this:: | |
175 | ||
176 | [java] Exception in thread "main" Status Code: 405, AWS Service: Amazon S3, AWS Request ID: null, AWS Error Code: MethodNotAllowed, AWS Error Message: null, S3 Extended Request ID: null | |
177 | ||
178 | ||
179 |