=================
 Troubleshooting
=================


The Gateway Won't Start
=======================

If you cannot start the gateway (i.e., there is no existing ``pid``),
check whether an ``.asok`` file left behind by another user exists.
If such an ``.asok`` file exists and there is no running ``pid``,
remove the ``.asok`` file and try to start the process again.

This may occur when you start the process as the ``root`` user while
the startup script tries to start the process as the ``www-data`` or
``apache`` user, and an existing ``.asok`` file prevents the script
from starting the daemon.

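The stale ``.asok`` check described above can be sketched in Python. This is
only an illustration under stated assumptions: ``is_stale_socket`` is a
hypothetical helper name, not part of Ceph, and it treats a socket file as
stale when the file exists but nothing accepts connections on it. The probe
connects and disconnects without sending any command.

.. code-block:: python

   import os
   import socket
   import stat


   def is_stale_socket(path):
       """Return True if `path` is a Unix socket file with no listener."""
       try:
           mode = os.stat(path).st_mode
       except FileNotFoundError:
           return False  # nothing to clean up
       if not stat.S_ISSOCK(mode):
           return False  # a regular file or directory, not a socket

       probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
       try:
           probe.connect(path)
       except ConnectionRefusedError:
           return True   # the file exists but no daemon is listening
       except OSError:
           return False  # e.g. permission denied: cannot tell, so do not guess
       else:
           return False  # a live daemon accepted the connection
       finally:
           probe.close()

If the function returns ``True`` for the gateway's ``.asok`` path, the file is
a leftover and can be removed before starting the process again. A
``False`` result caused by a permission error means the check should be
repeated as a user that is allowed to access the socket.
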
The radosgw init script (``/etc/init.d/radosgw``) also has a verbose
argument that can provide some insight into what the issue could be::

  /etc/init.d/radosgw start -v

or::

  /etc/init.d/radosgw start --verbose

HTTP Request Errors
===================

Examining the access and error logs of the web server itself is
probably the first step in identifying what is going on. A 500 error
usually indicates a problem communicating with the ``radosgw`` daemon.
Ensure that the daemon is running, that its socket path is configured,
and that the web server is looking for it in the proper location.


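A quick way to see whether 500 responses are present is to tally the status
codes in the web server's access log. The sketch below assumes the log uses
the Common or Combined Log Format, where the status code follows the quoted
request line. ``count_statuses`` is a hypothetical helper, and the location
of the access log depends on your web server.

.. code-block:: python

   import re
   from collections import Counter

   # In Common/Combined Log Format the status code follows the quoted
   # request line, e.g.:  "PUT /bucket/key HTTP/1.1" 500 0
   STATUS_RE = re.compile(r'"[^"]*" (\d{3}) ')


   def count_statuses(lines):
       """Count HTTP status codes found in an iterable of access-log lines."""
       counts = Counter()
       for line in lines:
           match = STATUS_RE.search(line)
           if match:  # lines in another format are skipped
               counts[match.group(1)] += 1
       return counts

Passing an open access log file to ``count_statuses`` returns a mapping from
status code to count. A non-zero count for ``"500"`` is the cue to check the
``radosgw`` daemon and its socket path as described above.
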
Crashed ``radosgw`` Process
===========================

If the ``radosgw`` process dies, you will normally see a 500 error
from the web server (apache, nginx, etc.). In that situation, simply
restarting ``radosgw`` will restore service.

To diagnose the cause of the crash, check the log in ``/var/log/ceph``
and/or the core file (if one was generated).


Blocked ``radosgw`` Requests
============================

If some (or all) ``radosgw`` requests appear to be blocked, you can get
some insight into the internal state of the ``radosgw`` daemon via
its admin socket. By default, there will be a socket configured to
reside in ``/var/run/ceph``, and the daemon can be queried with::

  ceph daemon /var/run/ceph/client.rgw help

  help                 list available commands
  objecter_requests    show in-progress osd requests
  perfcounters_dump    dump perfcounters value
  perfcounters_schema  dump perfcounters schema
  version              get protocol version

Of particular interest::

  ceph daemon /var/run/ceph/client.rgw objecter_requests
  ...

will dump information about current in-progress requests with the
RADOS cluster. This allows one to identify if any requests are blocked
by a non-responsive OSD. For example, one might see::

  { "ops": [
        { "tid": 1858,
          "pg": "2.d2041a48",
          "osd": 1,
          "last_sent": "2012-03-08 14:56:37.949872",
          "attempts": 1,
          "object_id": "fatty_25647_object1857",
          "object_locator": "@2",
          "snapid": "head",
          "snap_context": "0=[]",
          "mtime": "2012-03-08 14:56:37.949813",
          "osd_ops": [
                "write 0~4096"]},
        { "tid": 1873,
          "pg": "2.695e9f8e",
          "osd": 1,
          "last_sent": "2012-03-08 14:56:37.970615",
          "attempts": 1,
          "object_id": "fatty_25647_object1872",
          "object_locator": "@2",
          "snapid": "head",
          "snap_context": "0=[]",
          "mtime": "2012-03-08 14:56:37.970555",
          "osd_ops": [
                "write 0~4096"]}],
    "linger_ops": [],
    "pool_ops": [],
    "pool_stat_ops": [],
    "statfs_ops": []}

In this dump, two requests are in progress. The ``last_sent`` field is
the time the RADOS request was sent. If that time is well in the past, it
suggests that the OSD is not responding. For example, for request 1858,
you could check the OSD status with::

  ceph pg map 2.d2041a48

  osdmap e9 pg 2.d2041a48 (2.0) -> up [1,0] acting [1,0]

This tells us to look at ``osd.1``, the primary copy for this PG::

  ceph daemon osd.1 ops
  { "num_ops": 651,
    "ops": [
          { "description": "osd_op(client.4124.0:1858 fatty_25647_object1857 [write 0~4096] 2.d2041a48)",
            "received_at": "1331247573.344650",
            "age": "25.606449",
            "flag_point": "waiting for sub ops",
            "client_info": { "client": "client.4124",
                "tid": 1858}},
  ...

The ``flag_point`` field indicates that the OSD is currently waiting
for replicas to respond, in this case ``osd.0``.


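The two manual checks above, spotting requests whose ``last_sent`` is old
and reading the primary OSD from the ``ceph pg map`` output, can be sketched
in Python. This is an illustration only: ``old_requests`` and
``acting_primary`` are hypothetical helpers, the 30-second threshold is an
arbitrary assumption, and ``now`` must be given in the same clock and time
zone that the daemon used when it wrote ``last_sent``.

.. code-block:: python

   import re
   from datetime import datetime, timedelta


   def old_requests(dump, now, max_age=timedelta(seconds=30)):
       """Return (tid, osd, pg, age) for ops sent more than `max_age` ago.

       `dump` is the parsed JSON output of ``objecter_requests``.
       """
       found = []
       for op in dump.get("ops", []):
           sent = datetime.strptime(op["last_sent"], "%Y-%m-%d %H:%M:%S.%f")
           age = now - sent
           if age > max_age:
               found.append((op["tid"], op["osd"], op["pg"], age))
       return found


   def acting_primary(pg_map_line):
       """Return the first OSD id of the acting set in a `ceph pg map` line."""
       match = re.search(r"acting \[(\d+)", pg_map_line)
       return int(match.group(1)) if match else None

Each tuple returned by ``old_requests`` names a PG that can be passed to
``ceph pg map``, and ``acting_primary`` applied to that command's output line
gives the OSD whose ``ops`` are worth inspecting next.
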
Java S3 API Troubleshooting
===========================


Peer Not Authenticated
----------------------

You may receive an error that looks like this::

  [java] INFO: Unable to execute HTTP request: peer not authenticated

The Java SDK for S3 requires a valid certificate from a recognized certificate
authority, because it uses HTTPS by default. If you are just testing the Ceph
Object Storage services, you can resolve this problem in one of two ways:

#. Prepend the IP address or hostname with ``http://``. For example, change this::

     conn.setEndpoint("myserver");

   To::

     conn.setEndpoint("http://myserver");

#. After setting your credentials, add a client configuration and set the
   protocol to ``Protocol.HTTP``. ::

     AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);

     ClientConfiguration clientConfig = new ClientConfiguration();
     clientConfig.setProtocol(Protocol.HTTP);

     AmazonS3 conn = new AmazonS3Client(credentials, clientConfig);



405 MethodNotAllowed
--------------------

A 405 error looks like this::

  [java] Exception in thread "main" Status Code: 405, AWS Service: Amazon S3, AWS Request ID: null, AWS Error Code: MethodNotAllowed, AWS Error Message: null, S3 Extended Request ID: null

If you receive a 405 error, check whether the S3 subdomain is set up
correctly. You need a wildcard setting in your DNS record for subdomain
functionality to work properly.

Also, ensure that the default site is disabled.

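One way to test the wildcard DNS setting is to resolve a random label under
the gateway's domain: if a name that was never registered still resolves, a
wildcard record is most likely in place. The sketch below is an assumption
about how such a check could look. ``wildcard_resolves`` is a hypothetical
helper, and the ``resolve`` parameter exists only so that the lookup can be
replaced in tests.

.. code-block:: python

   import socket
   import uuid


   def wildcard_resolves(domain, resolve=socket.gethostbyname):
       """Return True if a random label under `domain` resolves in DNS."""
       # A random 32-character label is very unlikely to exist as a real record.
       probe = "{}.{}".format(uuid.uuid4().hex, domain)
       try:
           resolve(probe)
       except OSError:  # socket.gaierror is a subclass of OSError
           return False
       return True

Call it with the domain that your bucket subdomains hang off. A ``False``
result points at the missing wildcard record described above; a ``True``
result only shows that DNS answers, not that the web server is configured
correctly.
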