Few-NERD

About Few-NERD

Few-NERD is a large-scale, fine-grained, manually annotated named entity recognition dataset containing 8 coarse-grained types, 66 fine-grained types, 188,200 sentences, 491,711 entities, and 4,601,223 tokens. Three benchmark tasks are built on it: one supervised (Few-NERD (SUP)) and two few-shot (Few-NERD (INTRA) and Few-NERD (INTER)). Few-NERD was collected by researchers from Tsinghua University and DAMO Academy, Alibaba Group.

For more details about Few-NERD, please refer to our ACL-IJCNLP 2021 paper:

Getting started

Raw Datasets

Few-NERD is distributed under the CC BY-SA 4.0 license. Download the raw datasets from the following links:

Few-NERD (SUP) (14 MB)
Few-NERD (INTRA) (12 MB)
Few-NERD (INTER) (12 MB)
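
A minimal sketch for reading a raw split, assuming the files use the common two-column NER layout (one `token<TAB>label` pair per line, with a blank line between sentences). That layout, the function name `read_sentences`, and the sample labels are assumptions for illustration; check them against the downloaded files.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]


def read_sentences(lines: Iterable[str]) -> List[Sentence]:
    """Group `token<TAB>label` lines into sentences separated by blank lines."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        # Split on the last tab so the label is always the final field.
        token, label = line.rsplit("\t", 1)
        current.append((token, label))
    if current:
        sentences.append(current)
    return sentences


# Illustrative input only; the labels here are assumed, not copied from the data.
sample = ["Paris\tlocation-GPE\n", "is\tO\n", "\n", "Hello\tO\n"]
parsed = read_sentences(sample)
```

To read a real split, pass an open file handle instead of the sample list, for example `read_sentences(open(path, encoding="utf-8"))`, where `path` points at one of the extracted files.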

Sampled Datasets

Because the sampling strategy has a considerable impact on few-shot learning, we also release the data that we sampled (using util/fewshotsampler.py in our code). The sampled files are named following the pattern train/dev/test_N_K.json. We sampled 20,000, 1,000, and 5,000 episodes for train, dev, and test, respectively. The results in the paper and on the leaderboard were produced with this data.

Sampled Few-NERD (568 MB)
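
As a rough illustration of the naming pattern above, the sketch below lists the file names one would expect for the four few-shot settings reported on the leaderboard (5 or 10 way, 1~2 or 5~10 shot). It assumes that N is the number of ways, that K is the lower bound of the shot range, and that both appear as plain integers in the file name; these are assumptions, so compare the result against the actual archive contents.

```python
from itertools import product

SPLITS = ("train", "dev", "test")
WAYS = (5, 10)   # N in "N way"
SHOTS = (1, 5)   # assumed: K is the lower bound of "1~2 shot" and "5~10 shot"

# Episode counts per split, as stated in the text above.
EPISODES = {"train": 20000, "dev": 1000, "test": 5000}


def sampled_file_names():
    """Return the expected file names under the assumed <split>_<N>_<K>.json pattern."""
    return [f"{split}_{n}_{k}.json" for split, n, k in product(SPLITS, WAYS, SHOTS)]


names = sampled_file_names()
```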

Check out the GitHub repository for a comprehensive guide to using Few-NERD.

About the Leaderboard

To facilitate diverse research on named entities, we release all the data (including the test set) for the three tasks. We encourage the community to do research beyond these settings (such as open, unsupervised, or continual NER, or entity typing and linking). Enjoy!
We nevertheless maintain a leaderboard to record peer-reviewed results.

Contact

If you have any questions about Few-NERD, or you want to update the leaderboard, feel free to email the authors:

If you use Few-NERD in your work, please cite the paper:

@inproceedings{ding2021few,
  title={Few-NERD: A Few-shot Named Entity Recognition Dataset},
  author={Ding, Ning and Xu, Guangwei and Chen, Yulin and Wang, Xiaobin and Han, Xu and Xie, Pengjun and Zheng, Hai-Tao and Liu, Zhiyuan},
  booktitle={ACL-IJCNLP},
  year={2021}
}

The *supervised* setting is a standard NER task.
Rank	Date	Model	Code	Precision	Recall	F1-Measure
1	Feb 27, 2021	BERT-Tagger (Few-NERD paper)		65.56	68.78	67.13
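
As a quick consistency check on the row above, the sketch below recomputes F1 as the harmonic mean of precision and recall. It assumes the leaderboard uses the standard F1 definition and works from the rounded values shown, so it is only a sanity check, not the evaluation script.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# BERT-Tagger precision and recall from the supervised leaderboard row.
f1 = f1_score(65.56, 68.78)
print(f"{f1:.2f}")  # 67.13
```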

The *few-shot (intra)* setting is a few-shot NER task across different coarse-grained types.
Rank	Date	Model	Code	5 way 1~2 shot	5 way 5~10 shot	10 way 1~2 shot	10 way 5~10 shot	Avg
1	Feb 27, 2021	StructShot (Few-NERD paper)		30.21	38.00	21.03	26.42	28.92
1	Feb 27, 2021	ProtoBERT (Few-NERD paper)		20.76	42.54	15.05	35.40	28.43
1	Feb 27, 2021	NNShot (Few-NERD paper)		25.78	36.18	18.27	27.38	26.90

The *few-shot (inter)* setting is a few-shot NER task within coarse-grained types.
Rank	Date	Model	Code	5 way 1~2 shot	5 way 5~10 shot	10 way 1~2 shot	10 way 5~10 shot	Avg
1	Feb 27, 2021	StructShot (Few-NERD paper)		51.88	57.32	43.34	49.57	50.53
1	Feb 27, 2021	NNShot (Few-NERD paper)		47.24	55.64	38.87	49.57	47.83
1	Feb 27, 2021	ProtoBERT (Few-NERD paper)		38.83	58.79	32.34	52.92	45.72
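
The Avg column in the two few-shot tables appears to be the unweighted mean of the four setting scores. The sketch below checks that reading against the reported values, allowing a tolerance of 0.01 because the per-setting scores are already rounded. The dictionary layout and the name `max_avg_gap` are illustrative choices, not part of the Few-NERD code.

```python
from statistics import mean

# (task, model) -> (four setting scores, reported Avg), copied from the tables above.
ROWS = {
    ("intra", "StructShot"): ([30.21, 38.00, 21.03, 26.42], 28.92),
    ("intra", "ProtoBERT"): ([20.76, 42.54, 15.05, 35.40], 28.43),
    ("intra", "NNShot"): ([25.78, 36.18, 18.27, 27.38], 26.90),
    ("inter", "StructShot"): ([51.88, 57.32, 43.34, 49.57], 50.53),
    ("inter", "NNShot"): ([47.24, 55.64, 38.87, 49.57], 47.83),
    ("inter", "ProtoBERT"): ([38.83, 58.79, 32.34, 52.92], 45.72),
}


def max_avg_gap(rows) -> float:
    """Largest absolute gap between the recomputed mean and the reported Avg."""
    return max(abs(mean(scores) - reported) for scores, reported in rows.values())


gap = max_avg_gap(ROWS)
```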