Processing Massive Numbers of Small Files with Hadoop MapReduce: Custom InputFormat and RecordReader

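Generally speaking, the Hadoop MapReduce framework is designed for processing very large data sets, and that is where Hadoop really shows its strength. This does not mean Hadoop cannot process huge numbers of small files; it is just that processing them directly is inefficient. Given the design of HDFS, a huge number of small files also consumes a large amount of NameNode memory to hold file metadata (bookkeeping). In addition, because the files are small (by "small" we mean far smaller than the default HDFS block size of 64 MB, for example 1 KB to 2 MB), a computation may be unable to take full advantage of data locality, so a large amount of data ends up being transferred across the cluster, which is expensive.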
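To make the NameNode memory concern concrete, here is a rough back-of-the-envelope sketch in plain Java. It assumes the widely quoted rule of thumb that each namespace object (a file or a block) costs on the order of 150 bytes of NameNode heap; that figure, the class name, and the 1 TB example are illustrative assumptions and do not come from this article.

```java
// Rough estimate of NameNode heap used by file and block metadata.
// ASSUMPTION: ~150 bytes per namespace object (rule of thumb, not a measured value).
final class NameNodeMemoryEstimate {

    static final long BYTES_PER_OBJECT = 150L;

    // Number of HDFS blocks needed to hold one file of the given size.
    static long blocksPerFile(long fileSize, long blockSize) {
        if (fileSize == 0) {
            return 0;
        }
        return (fileSize + blockSize - 1) / blockSize; // ceiling division
    }

    // One object for the file itself plus one object per block.
    static long estimateHeapBytes(long fileCount, long fileSize, long blockSize) {
        long objectsPerFile = 1 + blocksPerFile(fileSize, blockSize);
        return fileCount * objectsPerFile * BYTES_PER_OBJECT;
    }

    public static void main(String[] args) {
        long blockSize = 64L << 20; // 64 MB, the default block size mentioned above
        long oneMb = 1L << 20;
        long oneGb = 1L << 30;

        // The same 1 TB of data, stored as 1 MB files versus 1 GB files.
        long small = estimateHeapBytes(1_048_576L, oneMb, blockSize);
        long large = estimateHeapBytes(1_024L, oneGb, blockSize);

        System.out.println("1 TB as 1 MB files: " + small + " bytes of metadata");
        System.out.println("1 TB as 1 GB files: " + large + " bytes of metadata");
    }
}
```

Under these assumptions the small-file layout needs roughly 300 MB of NameNode heap, while the large-file layout needs only a few megabytes for the same amount of data.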
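In practice, however, such scenarios do exist, and the need to process huge numbers of small files is common. So when using Hadoop for this kind of computation, we should consider turning the small data into large data, for example by merging and compressing the files, which makes the workload better suited to a Hadoop cluster to some degree. Hadoop also ships with some built-in solutions, and the APIs it provides make them easy to implement.
Below, we implement parallel processing of a huge number of small files by writing a custom InputFormat and RecordReader.
The basic idea is as follows:
The small files are merged in the Mapper. Each record in the output file consists of two parts: the name of the small file, and the content of that small file.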

Implementation

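We implement a WholeFileInputFormat to control the input specification of the Mapper; the actual reading of the input is handled by our custom WholeFileRecordReader. Once the Map tasks finish, we write the Map output unchanged to HDFS, using the simplest possible IdentityReducer.
Now let's look at what we need to implement: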

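  1. WholeFileRecordReader, which reads the content of each small file
  2. WholeFileInputFormat, which describes the input format of the small files
  3. WholeSmallfilesMapper, the Mapper implementation that merges the small files
  4. IdentityReducer, the Reducer implementation that writes out the merged files
  5. The job configuration and run that merges the many small files into large files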

Next, we describe each of these items in detail.

  • The WholeFileRecordReader class

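This class defines the input key/value types. For small files, each file corresponds to one InputSplit, so reading that InputSplit really means reading the entire content of a file that occupies a single block. The whole content of the file is read into a BytesWritable, that is, a byte array.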

package org.shirdrn.kodz.inaction.hadoop.smallfiles.whole;

import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class WholeFileRecordReader extends RecordReader<NullWritable, BytesWritable> {

	private FileSplit fileSplit;
	private JobContext jobContext;
	private NullWritable currentKey = NullWritable.get();
	private BytesWritable currentValue;
	private boolean finishConverting = false;

	@Override
	public NullWritable getCurrentKey() throws IOException, InterruptedException {
		return currentKey;
	}

	@Override
	public BytesWritable getCurrentValue() throws IOException, InterruptedException {
		return currentValue;
	}

	@Override
	public void initialize(InputSplit split, TaskAttemptContext context) throws IOException, InterruptedException {
		this.fileSplit = (FileSplit) split;
		this.jobContext = context;
		context.getConfiguration().set("map.input.file", fileSplit.getPath().getName());
	}

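	@Override
	public boolean nextKeyValue() throws IOException, InterruptedException {
		// Each split yields exactly one record: the whole file as a byte array.
		if (!finishConverting) {
			currentValue = new BytesWritable();
			// Small files only: the length is assumed to fit in an int.
			int len = (int) fileSplit.getLength();
			byte[] content = new byte[len];
			Path file = fileSplit.getPath();
			FileSystem fs = file.getFileSystem(jobContext.getConfiguration());
			FSDataInputStream in = null;
			try {
				in = fs.open(file);
				// readFully() keeps reading until all len bytes have arrived.
				IOUtils.readFully(in, content, 0, len);
				currentValue.set(content, 0, len);
			} finally {
				// closeStream() ignores null, so no explicit null check is needed.
				IOUtils.closeStream(in);
			}
			finishConverting = true;
			return true;
		}
		// The single record has already been returned.
		return false;
	}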

	@Override
	public float getProgress() throws IOException {
		float progress = 0;
		if (finishConverting) {
			progress = 1;
		}
		return progress;
	}

	@Override
	public void close() throws IOException {
		// Nothing to release: the input stream is opened and closed inside nextKeyValue().
	}
}

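When extending RecordReader, the core task is getting the iteration logic right: the framework calls nextKeyValue() on each iteration to find out whether another record is available, the method sets the current key and value, and getCurrentKey() and getCurrentValue() then return them. In our case each split yields exactly one record, the whole file.
In addition, we set "map.input.file" to the file name so that the Map task can retrieve it and write the file name as the output key.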
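The iteration contract described above can be illustrated without a cluster. The following is a minimal plain-Java simulation: `SimpleRecordReader`, `OneShotReader`, and `drive` are hypothetical stand-ins invented for this sketch (they are not Hadoop classes), and the key is simplified to the file name rather than NullWritable. The loop in `drive` mirrors the way the new-API `Mapper.run()` keeps calling `nextKeyValue()` and passes the current key and value to `map()`.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.BiConsumer;

final class RecordReaderLoopSketch {

    // Hypothetical, simplified stand-in for the RecordReader contract.
    interface SimpleRecordReader<K, V> {
        boolean nextKeyValue();
        K getCurrentKey();
        V getCurrentValue();
    }

    // Returns the whole "file" as a single record, then reports that it is done.
    static final class OneShotReader implements SimpleRecordReader<String, byte[]> {
        private final String fileName;
        private final byte[] content;
        private boolean finished = false;

        OneShotReader(String fileName, byte[] content) {
            this.fileName = fileName;
            this.content = content;
        }

        @Override
        public boolean nextKeyValue() {
            if (finished) {
                return false;
            }
            finished = true;
            return true;
        }

        @Override
        public String getCurrentKey() {
            return fileName;
        }

        @Override
        public byte[] getCurrentValue() {
            return content;
        }
    }

    // Calls map once per record and returns the number of calls made.
    static <K, V> int drive(SimpleRecordReader<K, V> reader, BiConsumer<K, V> map) {
        int calls = 0;
        while (reader.nextKeyValue()) {
            map.accept(reader.getCurrentKey(), reader.getCurrentValue());
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        OneShotReader reader = new OneShotReader("data_50000_000", content);
        int calls = drive(reader,
                (name, bytes) -> System.out.println(name + " -> " + bytes.length + " bytes"));
        System.out.println("map() calls: " + calls);
    }
}
```

Because the reader flips its flag on the first call, the map function runs exactly once per file, which is the behaviour WholeFileRecordReader relies on.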

  • The WholeFileInputFormat class
package org.shirdrn.kodz.inaction.hadoop.smallfiles.whole;

import java.io.IOException;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class WholeFileInputFormat extends FileInputFormat<NullWritable, BytesWritable> {

	@Override
	public RecordReader<NullWritable, BytesWritable> createRecordReader(InputSplit split, TaskAttemptContext context) throws IOException, InterruptedException {
		RecordReader<NullWritable, BytesWritable> recordReader = new WholeFileRecordReader();
		recordReader.initialize(split, context);
		return recordReader;
	}
}

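This class is simple. After extending FileInputFormat we must implement createRecordReader(), which returns the RecordReader used to read records from the file: we just create an instance of the WholeFileRecordReader implemented above and call initialize() on it. Note that the class does not override isSplitable(), so the default splitting of FileInputFormat applies; this is fine only as long as every input file is smaller than one HDFS block. To guard against larger files, override isSplitable(JobContext, Path) to return false.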
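To see when the default splitting matters, the sketch below models how many splits a single file produces. It is a plain-Java approximation written from my reading of `FileInputFormat.getSplits()` in Hadoop 1.x (split size = max(minSize, min(maxSize, blockSize)), with a 10% slack on the last split); the class and method names are hypothetical, and the exact constants should be treated as assumptions to be checked against the Hadoop version in use.

```java
// Approximate model of per-file split counting in FileInputFormat (Hadoop 1.x).
// ASSUMPTION: the slack factor of 1.1 and the formulas follow the stock implementation.
final class SplitCountSketch {

    static final double SPLIT_SLOP = 1.1;

    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    static int countSplits(long fileLength, long splitSize, boolean splitable) {
        // Empty files and non-splittable files always produce exactly one split.
        if (fileLength == 0 || !splitable) {
            return 1;
        }
        int splits = 0;
        long remaining = fileLength;
        while (((double) remaining) / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // the tail (possibly slightly larger than one split size)
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1L << 20;
        long splitSize = computeSplitSize(64 * mb, 1L, Long.MAX_VALUE);

        System.out.println("2 MB file, splittable:       " + countSplits(2 * mb, splitSize, true));
        System.out.println("200 MB file, splittable:     " + countSplits(200 * mb, splitSize, true));
        System.out.println("200 MB file, not splittable: " + countSplits(200 * mb, splitSize, false));
    }
}
```

With a 64 MB block size, a 2 MB file yields a single split either way, whereas a 200 MB file yields several splits unless splitting is disabled, in which case a whole-file reader would no longer see one complete file per Mapper.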

  • WholeSmallfilesMapper
package org.shirdrn.kodz.inaction.hadoop.smallfiles.whole;

import java.io.IOException;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WholeSmallfilesMapper extends Mapper<NullWritable, BytesWritable, Text, BytesWritable> {

	private Text file = new Text();

	@Override
	protected void map(NullWritable key, BytesWritable value, Context context) throws IOException, InterruptedException {
		String fileName = context.getConfiguration().get("map.input.file");
		file.set(fileName);
		context.write(file, value);
	}
}
  • The IdentityReducer class
package org.shirdrn.kodz.inaction.hadoop.smallfiles;

import java.io.IOException;

import org.apache.hadoop.mapreduce.Reducer;

// K and V are ordinary type parameters; naming them Text and BytesWritable
// would shadow the Hadoop classes of the same names.
public class IdentityReducer<K, V> extends Reducer<K, V, K, V> {

	@Override
	protected void reduce(K key, Iterable<V> values, Context context) throws IOException, InterruptedException {
		// Write every value through unchanged under its original key.
		for (V value : values) {
			context.write(key, value);
		}
	}
}

This is the Reduce task implementation; it simply writes the Map output unchanged to HDFS.

  • WholeCombinedSmallfiles
package org.shirdrn.kodz.inaction.hadoop.smallfiles.whole;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.shirdrn.kodz.inaction.hadoop.smallfiles.IdentityReducer;

public class WholeCombinedSmallfiles {

	public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {

		Configuration conf = new Configuration();
		String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
		if (otherArgs.length != 2) {
			System.err.println("Usage: combinesmallfiles <in> <out>");
			System.exit(2);
		}

		Job job = new Job(conf, "combine smallfiles");

		job.setJarByClass(WholeCombinedSmallfiles.class);
		job.setMapperClass(WholeSmallfilesMapper.class);
		job.setReducerClass(IdentityReducer.class);

		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(BytesWritable.class);
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(BytesWritable.class);

		job.setInputFormatClass(WholeFileInputFormat.class);
		job.setOutputFormatClass(SequenceFileOutputFormat.class);

		job.setNumReduceTasks(5);

		FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
		FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

		int exitFlag = job.waitForCompletion(true) ? 0 : 1;
		System.exit(exitFlag);
	}

}

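This is the entry point of the program. It mainly configures the MapReduce job; we only need to set the relevant options. We configure 5 Reduce tasks, so there will be 5 output files in the end.
Here the output format of the Reduce tasks is defined by SequenceFileOutputFormat, that is, the output is a SequenceFile, a binary file format.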
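Which of the 5 output files a given small file ends up in is decided by the partitioner. Since the job does not set one, Hadoop's default HashPartitioner applies, and its formula is `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`. The plain-Java sketch below reproduces only that formula; it uses `String.hashCode()` for illustration, whereas the real job hashes `Text` keys with a different hash function, so the partition numbers it prints are not expected to match the actual part files. The class name and sample keys are hypothetical.

```java
// Reproduces the HashPartitioner formula on plain Java objects.
// ASSUMPTION: String.hashCode() is used here purely for illustration;
// Hadoop's Text computes its hash differently.
final class HashPartitionSketch {

    static int partitionFor(Object key, int numReduceTasks) {
        // Masking with Integer.MAX_VALUE clears the sign bit, so the result is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 5; // same value as job.setNumReduceTasks(5)
        String[] fileNames = {"data_50000_000", "data_50000_005", "data_50000_014"};
        for (String name : fileNames) {
            System.out.println(name + " -> partition " + partitionFor(name, numReduceTasks));
        }
    }
}
```

Because the assignment depends only on the key's hash, the small files are spread roughly, but not exactly, evenly across the reducers, which is consistent with output files of similar but unequal size.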

Running the Program

  • Preparation
jar -cvf combine-smallfiles.jar -C ./ org/shirdrn/kodz/inaction/hadoop/smallfiles
xiaoxiang@ubuntu3:~$ cd /opt/stone/cloud/hadoop-1.0.3
xiaoxiang@ubuntu3:/opt/stone/cloud/hadoop-1.0.3$ bin/hadoop fs -mkdir /user/xiaoxiang/datasets/smallfiles
xiaoxiang@ubuntu3:/opt/stone/cloud/hadoop-1.0.3$ bin/hadoop fs -copyFromLocal /opt/stone/cloud/dataset/smallfiles/* /user/xiaoxiang/datasets/smallfiles
  • Running the MapReduce program
xiaoxiang@ubuntu3:/opt/stone/cloud/hadoop-1.0.3$ bin/hadoop jar combine-smallfiles.jar org.shirdrn.kodz.inaction.hadoop.smallfiles.whole.WholeCombinedSmallfiles /user/xiaoxiang/datasets/smallfiles /user/xiaoxiang/output/smallfiles/whole
13/03/23 14:09:24 INFO input.FileInputFormat: Total input paths to process : 117
13/03/23 14:09:24 INFO mapred.JobClient: Running job: job_201303111631_0016
13/03/23 14:09:25 INFO mapred.JobClient:  map 0% reduce 0%
13/03/23 14:09:40 INFO mapred.JobClient:  map 1% reduce 0%
13/03/23 14:09:46 INFO mapred.JobClient:  map 3% reduce 0%
13/03/23 14:09:52 INFO mapred.JobClient:  map 5% reduce 0%
13/03/23 14:09:58 INFO mapred.JobClient:  map 6% reduce 0%
13/03/23 14:10:04 INFO mapred.JobClient:  map 8% reduce 0%
13/03/23 14:10:10 INFO mapred.JobClient:  map 10% reduce 0%
13/03/23 14:10:13 INFO mapred.JobClient:  map 10% reduce 1%
13/03/23 14:10:16 INFO mapred.JobClient:  map 11% reduce 1%
13/03/23 14:10:22 INFO mapred.JobClient:  map 13% reduce 1%
13/03/23 14:10:28 INFO mapred.JobClient:  map 15% reduce 1%
13/03/23 14:10:34 INFO mapred.JobClient:  map 17% reduce 1%
13/03/23 14:10:40 INFO mapred.JobClient:  map 18% reduce 2%
13/03/23 14:10:46 INFO mapred.JobClient:  map 20% reduce 2%
13/03/23 14:10:52 INFO mapred.JobClient:  map 22% reduce 2%
13/03/23 14:10:58 INFO mapred.JobClient:  map 23% reduce 2%
13/03/23 14:11:04 INFO mapred.JobClient:  map 25% reduce 3%
13/03/23 14:11:10 INFO mapred.JobClient:  map 27% reduce 3%
13/03/23 14:11:16 INFO mapred.JobClient:  map 29% reduce 3%
13/03/23 14:11:22 INFO mapred.JobClient:  map 30% reduce 3%
13/03/23 14:11:28 INFO mapred.JobClient:  map 32% reduce 3%
13/03/23 14:11:34 INFO mapred.JobClient:  map 34% reduce 4%
13/03/23 14:11:40 INFO mapred.JobClient:  map 35% reduce 4%
13/03/23 14:11:46 INFO mapred.JobClient:  map 37% reduce 4%
13/03/23 14:11:52 INFO mapred.JobClient:  map 39% reduce 4%
13/03/23 14:11:58 INFO mapred.JobClient:  map 41% reduce 5%
13/03/23 14:12:04 INFO mapred.JobClient:  map 42% reduce 5%
13/03/23 14:12:10 INFO mapred.JobClient:  map 44% reduce 5%
13/03/23 14:12:16 INFO mapred.JobClient:  map 46% reduce 5%
13/03/23 14:12:22 INFO mapred.JobClient:  map 47% reduce 5%
13/03/23 14:12:25 INFO mapred.JobClient:  map 47% reduce 6%
13/03/23 14:12:28 INFO mapred.JobClient:  map 49% reduce 6%
13/03/23 14:12:34 INFO mapred.JobClient:  map 51% reduce 6%
13/03/23 14:12:40 INFO mapred.JobClient:  map 52% reduce 6%
13/03/23 14:12:46 INFO mapred.JobClient:  map 54% reduce 7%
13/03/23 14:12:52 INFO mapred.JobClient:  map 56% reduce 7%
13/03/23 14:12:58 INFO mapred.JobClient:  map 58% reduce 7%
13/03/23 14:13:04 INFO mapred.JobClient:  map 59% reduce 7%
13/03/23 14:13:10 INFO mapred.JobClient:  map 61% reduce 7%
13/03/23 14:13:13 INFO mapred.JobClient:  map 61% reduce 8%
13/03/23 14:13:16 INFO mapred.JobClient:  map 63% reduce 8%
13/03/23 14:13:22 INFO mapred.JobClient:  map 64% reduce 8%
13/03/23 14:13:28 INFO mapred.JobClient:  map 66% reduce 8%
13/03/23 14:13:34 INFO mapred.JobClient:  map 68% reduce 8%
13/03/23 14:13:40 INFO mapred.JobClient:  map 70% reduce 9%
13/03/23 14:13:46 INFO mapred.JobClient:  map 71% reduce 9%
13/03/23 14:13:52 INFO mapred.JobClient:  map 73% reduce 9%
13/03/23 14:13:58 INFO mapred.JobClient:  map 75% reduce 9%
13/03/23 14:14:04 INFO mapred.JobClient:  map 76% reduce 9%
13/03/23 14:14:10 INFO mapred.JobClient:  map 78% reduce 10%
13/03/23 14:14:16 INFO mapred.JobClient:  map 80% reduce 10%
13/03/23 14:14:22 INFO mapred.JobClient:  map 82% reduce 10%
13/03/23 14:14:28 INFO mapred.JobClient:  map 83% reduce 10%
13/03/23 14:14:34 INFO mapred.JobClient:  map 85% reduce 10%
13/03/23 14:14:37 INFO mapred.JobClient:  map 85% reduce 11%
13/03/23 14:14:40 INFO mapred.JobClient:  map 87% reduce 11%
13/03/23 14:14:46 INFO mapred.JobClient:  map 88% reduce 11%
13/03/23 14:14:52 INFO mapred.JobClient:  map 90% reduce 11%
13/03/23 14:14:58 INFO mapred.JobClient:  map 92% reduce 12%
13/03/23 14:15:04 INFO mapred.JobClient:  map 94% reduce 12%
13/03/23 14:15:10 INFO mapred.JobClient:  map 95% reduce 12%
13/03/23 14:15:16 INFO mapred.JobClient:  map 97% reduce 12%
13/03/23 14:15:22 INFO mapred.JobClient:  map 99% reduce 12%
13/03/23 14:15:28 INFO mapred.JobClient:  map 100% reduce 13%
13/03/23 14:15:37 INFO mapred.JobClient:  map 100% reduce 26%
13/03/23 14:15:40 INFO mapred.JobClient:  map 100% reduce 39%
13/03/23 14:15:49 INFO mapred.JobClient:  map 100% reduce 59%
13/03/23 14:15:52 INFO mapred.JobClient:  map 100% reduce 79%
13/03/23 14:15:58 INFO mapred.JobClient:  map 100% reduce 100%
13/03/23 14:16:03 INFO mapred.JobClient: Job complete: job_201303111631_0016
13/03/23 14:16:03 INFO mapred.JobClient: Counters: 29
13/03/23 14:16:03 INFO mapred.JobClient:   Job Counters
13/03/23 14:16:03 INFO mapred.JobClient:     Launched reduce tasks=5
13/03/23 14:16:03 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=491322
13/03/23 14:16:03 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/03/23 14:16:03 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/03/23 14:16:03 INFO mapred.JobClient:     Launched map tasks=117
13/03/23 14:16:03 INFO mapred.JobClient:     Data-local map tasks=117
13/03/23 14:16:03 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=719836
13/03/23 14:16:03 INFO mapred.JobClient:   File Output Format Counters
13/03/23 14:16:03 INFO mapred.JobClient:     Bytes Written=147035685
13/03/23 14:16:03 INFO mapred.JobClient:   FileSystemCounters
13/03/23 14:16:03 INFO mapred.JobClient:     FILE_BYTES_READ=147032689
13/03/23 14:16:03 INFO mapred.JobClient:     HDFS_BYTES_READ=147045529
13/03/23 14:16:03 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=296787727
13/03/23 14:16:03 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=147035685
13/03/23 14:16:03 INFO mapred.JobClient:   File Input Format Counters
13/03/23 14:16:03 INFO mapred.JobClient:     Bytes Read=147029851
13/03/23 14:16:03 INFO mapred.JobClient:   Map-Reduce Framework
13/03/23 14:16:03 INFO mapred.JobClient:     Map output materialized bytes=147036169
13/03/23 14:16:03 INFO mapred.JobClient:     Map input records=117
13/03/23 14:16:03 INFO mapred.JobClient:     Reduce shuffle bytes=145779618
13/03/23 14:16:03 INFO mapred.JobClient:     Spilled Records=234
13/03/23 14:16:03 INFO mapred.JobClient:     Map output bytes=147032074
13/03/23 14:16:03 INFO mapred.JobClient:     CPU time spent (ms)=79550
13/03/23 14:16:03 INFO mapred.JobClient:     Total committed heap usage (bytes)=19630391296
13/03/23 14:16:03 INFO mapred.JobClient:     Combine input records=0
13/03/23 14:16:03 INFO mapred.JobClient:     SPLIT_RAW_BYTES=15678
13/03/23 14:16:03 INFO mapred.JobClient:     Reduce input records=117
13/03/23 14:16:03 INFO mapred.JobClient:     Reduce input groups=117
13/03/23 14:16:03 INFO mapred.JobClient:     Combine output records=0
13/03/23 14:16:03 INFO mapred.JobClient:     Physical memory (bytes) snapshot=20658409472
13/03/23 14:16:03 INFO mapred.JobClient:     Reduce output records=117
13/03/23 14:16:03 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=65064620032
13/03/23 14:16:03 INFO mapred.JobClient:     Map output records=117
  • Verifying the results
xiaoxiang@ubuntu3:/opt/stone/cloud/hadoop-1.0.3$ bin/hadoop fs -ls /user/xiaoxiang/output/smallfiles/whole
Found 7 items
-rw-r--r--   3 xiaoxiang supergroup          0 2013-03-23 14:15 /user/xiaoxiang/output/smallfiles/whole/_SUCCESS
drwxr-xr-x   - xiaoxiang supergroup          0 2013-03-23 14:09 /user/xiaoxiang/output/smallfiles/whole/_logs
-rw-r--r--   3 xiaoxiang supergroup   30161482 2013-03-23 14:15 /user/xiaoxiang/output/smallfiles/whole/part-r-00000
-rw-r--r--   3 xiaoxiang supergroup   30160646 2013-03-23 14:15 /user/xiaoxiang/output/smallfiles/whole/part-r-00001
-rw-r--r--   3 xiaoxiang supergroup   27647901 2013-03-23 14:15 /user/xiaoxiang/output/smallfiles/whole/part-r-00002
-rw-r--r--   3 xiaoxiang supergroup   30161567 2013-03-23 14:15 /user/xiaoxiang/output/smallfiles/whole/part-r-00003
-rw-r--r--   3 xiaoxiang supergroup   28904089 2013-03-23 14:15 /user/xiaoxiang/output/smallfiles/whole/part-r-00004

xiaoxiang@ubuntu3:/opt/stone/cloud/hadoop-1.0.3$ bin/hadoop fs -text /user/xiaoxiang/output/smallfiles/whole/part-r-00000 | cut -d" " -f 1
data_50000_000     53
data_50000_005     4c
data_50000_014     47
data_50000_019     47
data_50000_023     50
data_50000_028     54
data_50000_032     45
data_50000_037     55
data_50000_041     4e
data_50000_046     4d
data_50000_050     4c
data_50000_055     55
data_50000_064     54
data_50000_069     42
data_50000_073     48
data_50000_078     54
data_50000_082     42
data_50000_087     53
data_50000_091     43
data_50000_096     41
data_50000_203     4d
data_50000_208     49
data_50000_212     48
data_50000_230     46

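As you can see, the Reduce phase produced 5 files, each a large file obtained by merging small files. If these files need further processing, you can implement another Mapper suited to the actual task and set its input path to the output path of the Reducer above. In this way, processing a huge number of small files is turned into processing a handful of large files, which lets us take full advantage of a Hadoop MapReduce cluster.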
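In the `fs -text` listing above, the second column appears to be the first byte of each file's content printed as a hexadecimal pair, which is how BytesWritable renders itself as text (hex pairs separated by spaces). Assuming that format, the following plain-Java sketch turns such a hex dump back into bytes; `HexDumpDecoder` is a hypothetical helper name, not part of Hadoop.

```java
import java.nio.charset.StandardCharsets;

// Decodes a space-separated hex dump (e.g. "53 4c 47") back into raw bytes.
// ASSUMPTION: the input follows the hex-pair text form described above.
final class HexDumpDecoder {

    static byte[] decode(String hexPairs) {
        String trimmed = hexPairs.trim();
        if (trimmed.isEmpty()) {
            return new byte[0];
        }
        String[] parts = trimmed.split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // parseInt handles values up to ff; the cast wraps them into a signed byte.
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    public static void main(String[] args) {
        // First bytes of three of the files in the listing: 53, 4c, 47.
        byte[] bytes = decode("53 4c 47");
        System.out.println(new String(bytes, StandardCharsets.US_ASCII));
    }
}
```

For real follow-up processing it is usually simpler to read the SequenceFile directly in a second job and take the bytes from the BytesWritable value, rather than going through the text rendering.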

Creative Commons License

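This article is published under the Attribution-NonCommercial-ShareAlike 4.0 license. You are welcome to repost, use, and republish it, provided that you keep the attribution to the author, 时延军 (including the link http://shiyanjun.cn), do not use it for commercial purposes, and release any work derived from it under the same license. If you have any questions, please contact me.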

Comments (10) on "Processing Massive Numbers of Small Files with Hadoop MapReduce: Custom InputFormat and RecordReader"

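  1. I think there is a problem with this approach. Each mapper corresponds to one split, which means every small file is handed to its own mapper. You do end up with merged large files, but the merge itself wastes a lot of space and resources.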

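    • You are right. This approach suits certain special scenarios (of course, if a better option exists we would choose it). For example, when the downstream computation is very complex, using this as a preprocessing step can save space or time for that later, more complex computation. Indeed, this approach is not recommended in general.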

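  2. Why doesn't WholeFileInputFormat override isSplitable()? The code above uses the default split strategy, so if a single file is larger than 64 MB, the value that WholeSmallfilesMapper receives would not be the whole file, right? I am new to Hadoop, so I am not sure whether I have got this right.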

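  3. Hello, could you please explain something?
    Aren't these small files split into blocks when they are uploaded? But the small files all differ in size, so can the block size vary with the size of each small file? Please advise! If convenient, could we get in touch? I read another of your articles on this topic as well; it was very well explained and I would like to ask you about it. My QQ is 2745270681. Thanks.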

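    • If a small file is smaller than one block, then that small file is one block. If the small files all differ in size, the resulting blocks naturally differ in size too.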

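  5. What if a MapReduce job has several documents in one folder to process, but each one needs to be processed separately? That is, run one MapReduce pass for each document, write the output, and then move on to the next document.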
